Preface


This book is actually an expanded version of my first book Introduction to 
Special Theory of Relativity published by Allied Publishers in 1998. The 
reason I wanted to write another book on the same subject is that I 
published several articles and papers over the last two decades in the 
journal Physics Education (on-line since 2013) brought out by the Indian 
Association of Physics Teachers. These articles were intended to give 
students and teachers a better grip of some key concepts associated with 
Relativity, both Special and General. I have incorporated these articles in 
this book by merging them within its chapters and sections. The reader will 
find the original articles in the Bibliography. Their references are shown in 
the next paragraph in square brackets [ ]. My treatment of these concepts
is intended to give additional strength to this book, and constitutes one of
its new features.

I mention some of these features here: An elaborate introduction to 
tensors and stress tensors; Maxwell’s stress tensor and conservation of 
momentum in electromagnetic fields [17]; graphical construction of Lorentz 
transformation, of time dilation, length contraction and simultaneity 
paradox [41]; magnetism as a relativistic effect [35]; Minkowski’s equation 
of motion illustrated by a mathematical treatment of relativistic rocket [27]; 
energy tensor for the electromagnetic field, and of a system of incoherent 
charged dust creating its own electromagnetic field [39]. 

Einstein’s two original papers that gave the world the Special Theory of 
Relativity, and how the well-known phenomenon of electromagnetic 
induction led him to this theory, have been detailed in my article 1905 
Relativity Papers of Einstein [40]. How Minkowski’s further pursuit 
resulted in a four-dimensional geometrical world-view of events, and 
subsequently of the dynamical variables of mechanics, and of 
electrodynamics, has been detailed in another of my articles, Minkowski’s
Space-Time [41]. These two supplementary resource materials are not a part
of this book, but they can be downloaded from my website, cited at the end
of this Preface. 

This book comprises four parts divided into twelve chapters. Part I
starts with a story of Special and General Relativity in brief, and is intended 
for a layman (Chapter 1). I have outlined the basic tenets of Special 
Relativity (Chapter 2), followed by Lorentz transformation (Chapter 3) and 
relativistic mechanics (Chapter 4). I have worked out many innovative 
problems and exercises. Some of them were designed to remove any doubt 
centred on relativity paradoxes, like simultaneity, time dilation, and length 
contraction. In fact my derivation of Lorentz transformation is itself an 
exercise in this direction. The detailed worked-out problems on Lorentz 
transformation and relativistic mechanics are intended to strengthen the 
reader’s understanding. It is hoped that an undergraduate student studying 
physics will not only understand and enjoy this part thoroughly but also 
derive all essential knowledge about Special Relativity from it. 

The rest of this book is intended for a reader seeking advanced 
knowledge, in particular the covariant language, in which advanced texts in 
Classical Electrodynamics are written. This is where its wider reach is 
discovered and the journey to General Relativity begins. 

The entire Part II (Chapters 5 and 6) is an introduction to tensors, with 
special emphasis on the stress tensor. The reader will discover to his 
amazement and enlightenment that empty space loaded with an 
electromagnetic field has stresses developing within it, just like a beam 
loaded with bricks, or any other ordinary matter, like solids and fluids, 
subjected to external forces. The special name for this stress is Maxwell’s 
stress tensor, and constitutes an important ladder to the energy tensor which 
I have explained in Part IV. 

Part III takes the reader into the four-dimensional world of Minkowski’s 
space-time. The introduction to tensors initiated in Part II, where it was 
restricted to three dimensions, now begins to give rich dividends in four 
dimensions, with terms like contravariant, covariant, raising and lowering 
of indices, and an introduction to the all-important metric tensor (Chapter 
7). 

But what is the need for this esoteric trip to four dimensions? Because 
without it, it would be impossible to look at relativistic mechanics in a 
holistic manner. Without it, how would one explain that energy and
momentum together form one unit, the 4-momentum (for which I have
coined a new name, En-Mentum, to remind the reader of its four components
and their sequence), which transforms as one unit under Lorentz
transformation, and likewise the 4-force (Pow-Force)? I have shown the Lorentz
transformation of a 4-vector, and specialized it to En-Mentum and Pow- 
Force and explained their significance and corollaries. Then I worked out 
Minkowski’s equation of motion (EoM), and conservation of energy and 
momentum as a single law of physics (Chapter 8). The Minkowski EoM is 
best illustrated by its application to a relativistic rocket (Chapter 9), and the
Lorentz transformation of 4-force by magnetism as a relativistic effect 
(Chapter 10). The Principle of Covariance, for which the covariant 
equations of electrodynamics stand as a shining example, is the culmination 
of this trip (Chapter 11). 

Part IV, titled Physics of a Relativistic Continua, has only one very
important chapter, namely Chapter 12, dedicated to the energy tensor. We 
take a brief look at the non-relativistic EoM of a perfect fluid, known as 
Euler’s equation, and extend the lessons to relativistic perfect fluid, which, 
in combination with the energy conservation of electromagnetic field, 
completes the energy tensor. 

Writing the former edition of this book, and its expansion to this version 
was a major challenge with manifold obstacles in the way. I owed it to the 
educational values inculcated at the hallowed precincts of the University of 
Illinois at Urbana-Champaign, and to the professional ethics of my great 
teachers, to face these challenges with determination in order to “present a 
true account” of the gifts I have inherited from my Alma Mater. My books 
on Special Relativity and Mechanics are a presentation of this account in a 
humble way. With deepest reverence and respect I recall some of my 
mentors: Professor Dillon Mapother (who was instrumental in my changing 
over from civil engineering to physics), Professor James H. Smith (from 
whom I learnt Special Theory of Relativity), Professor O. Hanson, Professor 
G. C. McVittie (legendary authority in General Relativity who taught me an 
introductory course on this subject), Professor Yavin, and Professor Peter 
Axel, to name a few. 

The memory of Professor James Allen, my research advisor, is etched 
permanently on my mind. He stood by me during some of my most trying 
times. To me he was the most shining example of love and kindness. 

There were two other persons who left their indelible footprints on the 
seashore of my life: Dr. Pratap Chandra Chunder, former Minister of
Human Resource Development, Government of India, and Dr. Rais Ahmed,
former Director of the NCERT.


Fig. 1. Alma Mater Statue at the University of Illinois


The entire typesetting of this book, down to its last details, and the
plotting of graphs and drawings were done by me under the operating system
Linux Mint 17.2. I used

• the document preparation system LaTeX 2ε for typesetting texts and
equations, however difficult and complex they might appear before
our eyes,

• Kile 2.1 for making the typesetting and editing operations easy,

• Gnuplot for preparing plots and graphs of mathematical equations,
however difficult they may be,

• Maxima for evaluating the most difficult integrals, which are impossible
to perform manually,

• Xfig for the drawings and for integrating all plotted graphs in the
drawings, and

• GIMP for integrating .jpg images into this book and into my
previous book Mechanics.


I bow my head in humble respect for those who gave their precious
time to write these programs for the students and teachers of the world,
and distributed them free of cost. 

I got my first tutorials in Linux, and Xfig from my first son-in-law 
Michael Murphy way back in 1995, and guidance in the use of Gnuplot and 
Maxima from my second son-in-law G. R. Santhosh. 

There were times when I was at my wit’s end, whether in understanding 
certain aspects of Relativity or in overcoming typesetting problems under 
LaTeX 2ε. I rushed to Professor A. V. Gopala Rao, himself a relativist of great
repute, who gave his time liberally to help me out of the difficulties. 

Professor Ashok Singal, Department of Astrophysics at Physical 
Research Lab, Ahmedabad, read my manuscript patiently and corrected the 
errors and mistakes that came to his notice. Without his watchful eyes some 
embarrassing errors would have gone into my book undetected. 

The final formatting of my manuscript to fit into the specified page size, 
has been done so meticulously by Mr. Kah-Fee Ng, Senior Editor, World 
Scientific Publishing Company. My special thanks to him. 

I shall mention a few more names: My former students Lakshmi 
Narayanan and B. Rajeswari who helped me in publishing this book; Mr. K. 
S. Venkatesh, CEO, QDP Technologies from whom I received all help to 
sort out my computer problems. 

I am grateful to Dr. H. Basavana Gowda, Cardiologist and Principal JSS 
Medical College, Mysuru, Dr. B. S. Ramesh, Radiation Oncologist at HCG 
Hospital, Bangalore, Dr. K. G. Srinivas, Medical Oncologist at Bharat 
Hospital and Institute of Oncology, Mysuru for their medical guidance 
during my most difficult and trying times. I am indebted to Dr. Shankara 
Narayana Jois, my Yoga Guru for guiding me to a healthier life and to Dr. 
K. V. Ravishankar of Usha Kiran Eye Hospital, Mysuru for helping my wife 
and me protect and maintain our vision as we age. 

My granddaughter Barsha Manjari Kush has been the inspiration behind 
all my creative works in physics and music. My wife Aloka, and my two 
daughters Anuradha and Madhusmita propped up my sinking spirits when I 
lost hopes of completing my multi-dimensional projects. 


I have been enjoying Fuller Fund Membership of the American 
Association of Physics Teachers since October 2001. This allowed me 
access to the American Journal of Physics, crucial for writing papers in 
Classical Electrodynamics and Special Relativity. I thank Prof. Rogers 
Fuller, Associate Director of Membership, AAPT, and Harold Q and 
Charlotte Mae Fuller for this precious gift. 

I shall conclude by citing my website 


http://sites.google.com/site/physicsforpleasure 


where the readers will find many of my physics experiments, papers and 
articles (published or unpublished), and my music projects. 

I conclude by wishing the readers an enjoyable experience in going 
through this book. 


Somnath Datta 
Mysuru, November 30, 2020 
Part I


Einsteinian Relativity 


Chapter 1 
What Is Relativity? 


1.1. Influence of the Human Value System on the Evolution of 
Physical Theories 


The physical laws, the way we understand them, are man-made constructs 
of mathematical models, capable of discerning order and pattern in a 
myriad manifestations of the insentient, impersonal and impercipient 
nature. They are a code of conduct, believed to be obeyed by all children of 
nature — particles and objects of various hues and forms, forces and fields 
— in their mutual interactions, propagation and dynamical behaviour — as 
they weave a web of evolution through the vast expanse of space and time. 
It is not unnatural, therefore, that these codes of the impercipient world — 
being codification by the sentient human mind — should borrow from the 
ambience of human society, concepts, models, imageries and even value 
systems. 

For example, the principle of least action — which asserts that all 
dynamical processes of change in the universe are motivated by a 
propensity to extremize the “action integral” — which is the Lagrangian 
function summed over time — is an adaptation from human society of 
man’s own drive for optimizing the fruits of his efforts and the value for his 
money. As a corollary, it is the most sensible thing for an insentient electron 
or a missile to reach its target by that shortest route which also takes into 
consideration the compulsions of a prevailing ambient electric or 
gravitational field, just as it is the most sensible thing for a sentient human 
to reach his destination by that shortest route which also takes into 
consideration the various compulsions of his mission, places of special 
interest as well as lurking dangers in an unfamiliar terrain. 


As another example, the facility with which many of us handle, and
even enjoy, trigonometrical exercises, lends a natural instinct to extend
geometrical models for fathoming the mysteries of physical nature. And
what else can be a simpler geometrical object than a straight line of a 
measured length and orientation — more descriptively, a directed line 
segment — popularly known as a vector? Naturally, therefore, all of 
classical physics is dominated by the vector model of various physical 
quantities — like force, velocity, momentum, electric and magnetic fields 
and others — each one of which is modelled as the segment of a straight 
line whose changing length and direction tell us all about the dynamical 
nature of the prototype. Understanding, as well as applying, physical
laws thus greatly simplifies to the construction of straight lines and triangles,
and their interpretation with the help of trigonometrical formulas.

With the passage of time, human society has changed and its value 
systems have transformed, leaving their mark on the evolution of physical 
theories. The emergence of democracy as a universally accepted social 
order, for instance, also contained the seeds of two powerful streams of 
physical theories that have occupied the centre stage since the turn of the 
20th century. One of them, called quantum mechanics, starts with Einstein’s 
interpretation of the photoelectric effect in terms of light quanta and de
Broglie’s hypothesis of wave-particle duality. In effect, they served to 
remove any fundamental distinction between the world of radiation and the 
world of matter — treating both on the same footing, endowing both with 
wave and corpuscular attributes. The other stream of thought, called the
theory of relativity, proclaims democracy of all frames of reference. It starts
with an outright rejection of the prevailing notion of an absolute frame, just 
as democracy starts with an outright rejection of an absolute ruler. 
Relativity teaches us to look upon all frames of reference as equivalent, just 
as democracy teaches us to treat different frames of opinion with equal 
respect. 


1.2. Rejection of Absolute Frame 


A frame of reference is a set of permanent benchmarks with respect to 
which the coordinates of all moving objects are defined. To concretize the 
idea, a frame of reference is often viewed in classical physics as a set of 
three mutually perpendicular rigid bars extending up to infinity (Fig. 1.1). 


These bars serve as the X-, Y- and Z-axes with respect to which the 
coordinates (x, y, z) of a moving object are measured. Time rates of change 
of these coordinates lead to velocity, momentum, acceleration and a host of 
other derived physical quantities. Systematic observation, followed by 
careful study and analysis of the collected data, often leads to the 
speculation of certain types of temporal, spatial and interconnecting 
relations among such physical quantities, which are enunciated as physical 
laws. From its very nature, therefore, a law is presumed to be valid in one 
particular frame, namely the one from which, or with respect to which, the
observations were made. 


Fig. 1.1. Coordinates of a projectile. 


The apparent diurnal motion of stars and planets across the sky, for 
instance, led to the earlier Aristotelian speculation of a geocentric model in 
which the earth was taken to be a “fixed frame” of reference at the centre of 
the universe. In the 16th century, Nicolaus Copernicus discarded the
geocentric model in favour of a heliocentric universe in which the sun is the 
centre of a fixed frame of reference with the entire clan of “fixed stars” 
forming a backdrop of reference points against which the earth, the moon 
and the other planets and their satellites moved. From the astronomical data 
of these motions, collected by Tycho Brahe and mathematically analyzed by 
Johannes Kepler, Newton arrived at the Universal Law of Gravitation. 


Thus, at least till the end of the 19th century, man thought only in terms 
of one supreme standard of frame of reference, the so-called Absolute 
Frame (often intuitively identified with the sun and the “fixed stars”) which 
also served as the standard of absolute rest. This also defined “motion” in 
an absolute sense, to be one in which the coordinates of an object, as 
measured in the absolute frame, are seen to be changing with time. The theory
of relativity brings about a complete and radical overthrow of that attitude
by declaring that such notions as Absolute Frame, absolute motion and
absolute rest are only myths, which can never be substantiated by any
experiment. Relativity asserts that all motions are relative, all states of rest 
are also relative — relative to the particular frame of reference we have 
fixed for the convenience of our observation and calculations. The so-called 
Absolute Frame does not exist, because there is no special attribute, special 
quality that can be experimentally measured, and then recognized to be an 
exclusive preserve of any particular frame. The laws of physics work 
equally well in all frames of reference, provided we write them in the 
correct mathematical language. 


1.3. Relativity Principle: A Rudimentary Form 


We can explain this in the following way. Suppose, either by subjective 
judgement or by established convention, one particular frame has long been 
looked upon as the standard, which, for fixing ideas, we take as the one 
riveted on the surface of the earth. A team of experimenters had discovered 
the laws of physics by performing a chain of experiments in this frame and 
analyzing the table of collected data. The same team now decides to repeat 
these experiments inside a mobile laboratory set up on a train which is 
coasting smoothly along a straight line with uniform velocity. (Imagine the 
journey to be so smooth that not even a sound or jerk comes from the rails.) 
Relativity foretells that the results of this second series of experiments will 
be identical to those in the first, that the new set of data tables will replicate 
the older one. This implies that the same laws of physics are at work when 
observed from the ground as when observed from the train. This also means 
that there is no experimental way of discriminating between the two frames, 
or of establishing any special feature shared by one of them — say the 
ground frame — and not by the other. This being the case, the 
experimenters have no objective means of deciding which one of the two
frames is moving in an absolute sense and which one is stationary. They
can, however, “look out”, see the trees, the mileposts, the hills and the
rivers passing them by, and thereby conclude that their lab is moving
relative to the ground. Alternatively, they can also believe that the ground
(along with the hills and the trees) is moving while their lab is stationary.
Relativity considers both these viewpoints equally valid and prohibits any 
objective judgement about who is moving and who is stationary, because 
motion is always and intrinsically relative. 

We can now summarize the above experience as a limited relativity 
principle in the following not-so-precise form. Two frames of reference
that are “non-rotating”* and moving with uniform velocity with respect to 
each other are equivalent and indistinguishable in all respects. This 
statement forms the core of the special theory of relativity. 


1.4. Inertial Forces 


If we take this viewpoint somewhat further, we can even elevate the discarded
geocentric view of the universe from the dustbin of history to a status of 
equality with the heliocentric view. To be more precise, let us define the 
geocentric frame of reference as three mutually perpendicular rigid bars 
(serving as X-, Y-, Z-axes) riveted to the body of the earth and stretching up 
to infinity. The spirit of relativity would seem to suggest that this geocentric 
frame and the heliocentric one are equivalent. The universe revolving 
around the earth, and the earth revolving around the sun, are two apparently 
different, but legitimate, modes of description of the same schemes of 
nature at work. 

The reader will probably react to the above suggestions with disbelief. 
The universe with the stars and mighty galaxies all going around us! And to 
expect a modern scientific mind of the 20th century (or, 21st century) to 
place such nonsense on a par with the respectable and well-founded theory
of the earth’s absolute rotation around its axis! Isn’t the fact that the earth is
slightly thicker around the equator (making it look more like an oblate
spheroid than a sphere) enough evidence that the earth is absolutely
rotating? The necessary flattening force on the earth — known as the 
centrifugal force — comes only because the earth is turning. How can it 
originate if the earth is believed to be stationary and, instead, the universe is 
made to revolve around? 


The centrifugal force, cited in the above paragraph as evidence of the
earth’s rotation, is an example of a class of inertial forces which, as every
student of mechanics knows, need to be “invented” in order to make
Newton’s second law of motion valid in a general accelerating frame 
(within the class of which we shall include also a rotating frame of 
reference). Such inertial forces cannot be real, because they do not originate 
from any material source (the way gravity forces and electro-magnetic 
forces do). Being a parentless, illegitimate child, an inertial force is often 
called a “fictitious force”. Anyone who analyzes the motion from a “non- 
accelerating” frame does not see any inertial force at all. 

Inertial forces are daily encountered by every commuter when the train 
or the bus he rides suddenly starts or suddenly halts. At such moments he 
feels a jerky backward thrust or a forward pull — which are examples of 
inertial forces. An observer who watches the motion from the ground would 
argue that the sudden backward or forward movement of the body of the 
commuter is not due to a real force, but due to inertia inherent in the 
commuter’s body, in conformity with Newton’s first law of motion. 

As a general rule, an inertial force $-m\mathbf{a}$ is “imagined” to be acting on an
object of mass $m$, when viewed from a frame of reference which is moving
with acceleration $\mathbf{a}$. Anyone trying to stand or walk on a merry-go-round
experiences two kinds of inertial force, namely a centrifugal force and a
Coriolis force. The first is velocity independent, whereas the
second is strictly proportional to the velocity of the walker. These two
forces are exactly analogous to the forces exerted on a particle carrying
electrical charge $e$ by an electric field $\mathbf{E}$ (force $= e\mathbf{E}$), and a magnetic field
$\mathbf{B}$ (force $= e\mathbf{v} \times \mathbf{B}$, where $\mathbf{v}$ is the velocity) respectively. If a platform is
rotating about an axis with an angular velocity $\boldsymbol{\omega}$, then an object of mass $m$
at a distance $r$ from the axis experiences a centrifugal force directed
outwards from the axis and having magnitude $m\omega^{2}r$; whereas the Coriolis
force is given by $m\mathbf{v} \times 2\boldsymbol{\omega}$. Direct evidence of the existence of the Coriolis
force is provided by a Foucault pendulum, which is nothing but an ordinary
pendulum with a rather heavy bob (so as to be relatively unaffected by air
friction) suspended by a very long thread from the ceiling of a very tall
building. One such Foucault pendulum, which is on public display in the
lobby of the UN headquarters in New York, is suspended from a 75-foot-high
ceiling.
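
As a rough numerical illustration of the two inertial forces just described, the following sketch evaluates the centrifugal magnitude (mass times the square of the angular velocity times the distance from the axis) and the Coriolis force (twice the mass times the cross product of the velocity with the angular velocity). The function names, the equatorial radius of the earth and the length of the sidereal day are assumptions introduced here for illustration only; they do not come from the text.

```python
import math

def cross(a, b):
    """Cross product of two 3-component vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def centrifugal_magnitude(mass, omega, distance_from_axis):
    """Magnitude m * omega^2 * r of the centrifugal force (newtons)."""
    return mass * omega ** 2 * distance_from_axis

def coriolis_force(mass, velocity, omega_vec):
    """Coriolis force 2 m v x omega, as a 3-component tuple (newtons)."""
    return tuple(2.0 * mass * c for c in cross(velocity, omega_vec))

# Assumed illustrative values (not from the text).
SIDEREAL_DAY_S = 86164.0       # one rotation of the earth, in seconds
EARTH_EQ_RADIUS_M = 6.378e6    # equatorial radius, in metres
omega_earth = 2.0 * math.pi / SIDEREAL_DAY_S

# Centrifugal force on a 1 kg mass resting at the equator.
centrifugal_equator = centrifugal_magnitude(1.0, omega_earth, EARTH_EQ_RADIUS_M)

# Coriolis force on a 1 kg walker moving along +x at 1 m/s on a platform
# turning about the +z axis with the same angular velocity.
coriolis_example = coriolis_force(1.0, (1.0, 0.0, 0.0), (0.0, 0.0, omega_earth))

print(f"centrifugal force at the equator: {centrifugal_equator:.4f} N per kg")
print("Coriolis force components:", coriolis_example)
```

With these assumed values the centrifugal force per kilogram at the equator is a few hundredths of a newton, small compared with the weight of the same mass, while the Coriolis force on the walker points sideways to the direction of motion.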


The plane of oscillation of this pendulum (or, in principle, any 
pendulum suspended from a fixed support in any earthly physics lab) will 
not be confined to a fixed vertical plane. Instead it will turn slowly with a 
period of rotation equal to $1/\sin\lambda$ days, where $\lambda$ is the latitude of the location
on the earth. 
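
The dependence of this period on latitude can be tabulated with a short sketch. It simply evaluates one over the sine of the latitude, in days, as stated above; the function name and the latitude taken for New York are assumptions made here for illustration.

```python
import math

def foucault_period_days(latitude_deg):
    """Days needed for the swing plane to complete one full turn.

    Implements 1 / sin(latitude); at the equator the plane does not
    turn at all, so the period is reported as infinite.
    """
    s = abs(math.sin(math.radians(latitude_deg)))
    if s < 1e-12:
        return math.inf
    return 1.0 / s

NEW_YORK_LATITUDE_DEG = 40.75   # assumed approximate latitude

period_pole = foucault_period_days(90.0)
period_30 = foucault_period_days(30.0)
period_new_york = foucault_period_days(NEW_YORK_LATITUDE_DEG)

for name, value in [("pole", period_pole),
                    ("latitude 30 degrees", period_30),
                    ("New York (assumed latitude)", period_new_york)]:
    print(f"{name}: {value:.2f} days")
```

At a pole the plane turns once a day, at latitude 30 degrees it needs two days, and the period grows without limit as the equator is approached.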

Thus, we find that the slight bulging at the equatorial plane of the earth 
and the slow precession of the plane of oscillation of a Foucault pendulum 
show the existence of centrifugal and Coriolis forces at any place on the 
earth. They, in turn, provide the irrefutable evidence that the earth is turning 
around its axis. 

There is another absurd implication of the geocentric view which the
reader may not have missed. A galaxy which is, say, one billion light years
away, will be orbiting a circular path of $2\pi$ billion light years in just one day,
suggesting an incredible speed of $730\pi \times 10^{9}c$, where $c$ is the speed of
light. No physical theory will allow such nonsense.
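
The arithmetic behind this estimate can be checked with a minimal sketch: a circle whose radius is the distance of the galaxy is traversed once per day, and a distance in light years divided by a time in years gives a speed in units of the speed of light. The function name and the use of 365.25 days per year are assumptions made here.

```python
import math

DAYS_PER_YEAR = 365.25   # assumed: the Julian year used to define the light year

def geocentric_speed_in_c(distance_ly, period_days=1.0):
    """Speed, in units of c, of an object circling the earth once per period.

    distance_ly is the radius of the circle in light years; light years per
    year is numerically the speed in units of c.
    """
    circumference_ly = 2.0 * math.pi * distance_ly
    period_years = period_days / DAYS_PER_YEAR
    return circumference_ly / period_years

speed = geocentric_speed_in_c(1.0e9)      # galaxy one billion light years away
quoted = 730.0 * math.pi * 1.0e9          # the rounded figure quoted in the text

print(f"computed speed: {speed:.3e} c")
print(f"quoted speed:   {quoted:.3e} c")
```

The two numbers differ only through the rounding of 2 x 365.25 to 730.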

The above example helps to underscore certain complexities and 
subtleties associated with the concept of frame of reference in the complete 
theory of relativity. It is not legitimate to think of a frame of reference as a 
non-rotating set of rigid bars stretching up to infinity. A frame of reference 
in relativity is a mathematical construct, sometimes lacking a complete 
visual picture. Two different frames of reference are related to each other by 
means of mathematical transformation equations satisfying certain 
conditions. With the choice narrowed down to legitimate frames of 
reference satisfying required conditions, the fundamental credo of relativity 
is still equal status for them all. We do not intend to pursue this argument 
further for the fear of straying away from our main objective. 

Fortunately, some of the above considerations do not fog the clarity of 
the special theory of relativity with which this book is primarily concerned. 
This special theory, which Albert Einstein enunciated in 1905 while
working as a clerk in the patent office in Bern, does make a distinction
between a class of “privileged” frames called the inertial frames and the 
non-inertial ones. Without going into proper definition right now, let us 
accept naively that an inertial frame of reference is the archetype of non- 
accelerating, non-rotating frames. They are the ones in which the inertial 
forces are absent, so that Newton’s first law of motion (often called the law 
of inertia) is strictly valid. In other words, an inertial frame of reference is
one with respect to which a point particle will continue to move along a
straight line with uniform speed, so long as it is free from external forces.

The relativity principle, enshrined in the special theory of relativity,
proclaims that all inertial frames of reference are equivalent. This statement
is a refinement of the relativity principle stated at the end of Sec. 1.3.


1.5. Principle of Equivalence 


Even though our limited mission is an exposition of the special theory of 
relativity, it will be helpful to realize the central theme of relativity shared 
by both the general and the special theory. 

Einstein was not satisfied with the limited pronouncement of the 
relativity principle in special relativity, in which the inertial frames have 
been given a special status. In order to remove all distinctions between 
inertial and non-inertial frames, one has to appreciate the feature by which
one distinguishes a non-inertial frame from an inertial one. This feature is
the appearance of the inertial forces in non-inertial frames, and their absence
in the inertial frames, as already mentioned.

Einstein himself considered the following thought experiment.
Suppose, while you are inside an elevator, someone cuts the supporting
cable. The elevator will be falling freely in the earth’s gravitational field.
But so will you, with the same acceleration g as the elevator, so that you will
be floating inside. Seen in another way, there will be an inertial force -mg 
acting on you (see Sec. 1.4), which together with the gravitational force mg 
of the earth will result in zero force on you. As a consequence, you will be 
weightless, just like an astronaut inside his space lab orbiting around the 
earth. 

On the other hand, suppose your elevator is moved upwards with a
constant acceleration g (so that, taking the downward direction as positive,
the acceleration of the elevator is -g). Then you will feel twice as heavy,
because now the induced inertial force will be mg which, along with the
existing gravitational force mg, will result in a total force of 2mg.
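
The two elevator cases can be put side by side in a small sketch. It adds the inertial force to the true weight, taking the acceleration of the elevator as positive when directed upwards; the function name, the passenger's mass and the value of g are assumptions chosen here for illustration.

```python
def apparent_weight(mass, elevator_accel_up, g=9.81):
    """Total downward force felt by a passenger, in newtons.

    The true weight is m*g; the inertial force in the elevator frame is
    m*a directed opposite to the elevator's acceleration, so an upward
    acceleration a adds m*a to the weight and a downward one subtracts it.
    """
    return mass * (g + elevator_accel_up)

G = 9.81       # assumed value of g, in m/s^2
MASS = 70.0    # assumed passenger mass, in kg

weight_at_rest = apparent_weight(MASS, 0.0, G)
weight_free_fall = apparent_weight(MASS, -G, G)   # cable cut: accelerating down at g
weight_pushed_up = apparent_weight(MASS, +G, G)   # accelerating up at g

print(f"at rest:         {weight_at_rest:.1f} N")
print(f"free fall:       {weight_free_fall:.1f} N")
print(f"accelerating up: {weight_pushed_up:.1f} N")
```

The free-fall case gives zero, and the upward case gives exactly twice the weight at rest.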

Thus, the inertial force has the remarkable property that it can get mixed 
up with the gravitational force to cause either a total or partial cancellation 
of the same, or an enhancement, or even change in the direction and 
magnitude of the same. 


As a preamble to the General Theory of Relativity, therefore, Einstein
proposed his famous Principle of Equivalence. According to this principle,
the inertial forces are equivalent to gravitational forces. One can generate 
the force of gravity of arbitrary direction and magnitude by accelerating or 
rotating his spaceship, or frame of reference suitably. One can, conversely, 
destroy the existing force of gravity at will by letting his frame of reference 
fall freely in the existing gravitational field. There is no experiment 
whatsoever — either in electrodynamics, or in optics, or in any other 
discipline of physical science — which can differentiate between the effect 
of “true” gravity force (produced by the earth, or the sun) and the effect of 
inertial forces induced due to acceleration of his frame. In other words, at 
least the local effects of inertial forces are exactly the same as those of the
true gravity forces, so much so that the inertial forces are also a kind of
gravity force.

A trivial example of the equivalence principle is the prediction that light 
bends downwards under gravity. Consider the same elevator which has a 
hole A on its eastern wall. The elevator is accelerating with an acceleration 
g upwards in early morning, when a ray of light, progressing along a 
horizontal straight line, enters through A and falls on a mark B on the 
western wall. Let the time of flight of the light ray from A to B be t. In this 
time, the elevator has moved upwards by a distance $s = \tfrac{1}{2}gt^{2}$. Therefore, the
mark B must be the same distance s below a corresponding horizontal line
AH drawn inside the elevator. Since the accelerating elevator in gravity-free
space is equivalent to a stationary elevator under gravity, light must fall by 
the same distance in a gravitational field g. In other words, the trajectory of 
a photon deviates from its straight line path by bending towards a 
gravitating mass. Such bending of a ray of light in a “true” gravitational
field can be confirmed during a solar eclipse by measuring how far a light
ray that grazes the periphery of the sun is deflected from its normal
direction. Experiments have confirmed this effect.
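
To get a feeling for the size of the effect in the elevator argument, the sketch below evaluates the drop of one half g times the square of the time of flight, with the time of flight taken as the width of the elevator divided by the speed of light. The function name, the width of 3 m and the value of g are assumptions made here; the calculation covers only the elevator thought experiment, not the deflection of starlight by the sun.

```python
SPEED_OF_LIGHT = 299_792_458.0   # metres per second

def light_drop(width_m, g=9.81):
    """Distance (metres) by which the light ray falls while crossing the elevator.

    Assumes the ray crosses horizontally, so the time of flight is
    width / c, and applies s = (1/2) g t^2.
    """
    t = width_m / SPEED_OF_LIGHT
    return 0.5 * g * t ** 2

ELEVATOR_WIDTH_M = 3.0   # assumed width of the elevator

drop = light_drop(ELEVATOR_WIDTH_M)
print(f"time of flight: {ELEVATOR_WIDTH_M / SPEED_OF_LIGHT:.3e} s")
print(f"drop of the ray: {drop:.3e} m")
```

For these assumed values the drop is far smaller than the diameter of an atomic nucleus, which suggests why the bending has to be sought in the much stronger and more extended gravitational field of the sun.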

In view of the equivalence principle, the concept of inertial frame — 
which is central in the theory of relativity — needs to be redefined. We can 
now define an inertial frame to be a frame of reference with respect to 
which all inertial and gravity forces are absent. In the vicinity of a 
gravitating mass (e.g. the earth) such a frame can be realized inside a non- 
rotating box which is falling freely in the given gravitational field. 


Einstein’s freely falling elevator and earth-orbiting space-labs (which are 
also freely falling under gravity) provide the best examples. 


1.6. Tidal Forces 


With the proclamation of the Equivalence Principle, gravity seems to 
disappear into thin air. We can create and generate the force of gravity at 
will by changing the frame of reference. Moreover, seen from a truly 
inertial frame, there is no gravity at all. What is then the fate of the 
Universal Law of Gravitation discovered by Newton, the triumph of which 
had been heralded by all the planets and satellites sailing in the sky? Does 
that also vanish into thin air? 

It is obvious that, since inertial forces and gravity forces are intimately 
intermixed, and since the inertial forces are the offspring of non-inertial 
frames, a truly generalized theory of relativity (i.e. the one that treats all 
frames of reference on equal footing) must also be a theory of gravitation. 
Indeed, Einstein’s General Theory of Relativity — which he published in 
1916 — is also the most modern theory of gravitation. It was the towering 
achievement of the genius of Einstein to isolate “real” gravity (e.g. the one 
that is generated by a massive object, like the sun) from piles of “spurious” 
ones (i.e. the inertial forces posing as gravity). 

The local effects of both gravities being the same (both can be created, 
or destroyed by suitably selecting frames of reference; both have the same 
effect on material particles and light), can there be some global effects by 
which the “real” gravity can be isolated? 

Real gravity manifests itself in the phenomenon of tide, which is a 
global effect.ᵈ This effect can be seen in the deformation of an extended 
massive object falling freely under gravity. 

It should be easy to visualize the deformation of a spherical mass 
of radius r which has been dropped from a height h above the 
surface of the Earth and falls vertically towards the Earth’s centre, as shown 
in Fig. 1.2(a). The particle A, being nearer to the earth than C, experiences a 
larger gravitational force than C and falls faster than C. The particle B, 
being farther away, experiences a smaller gravitational force and falls more 
slowly. As a consequence C lags behind A, B lags behind C, and the diameter 
AB slowly elongates from r to r + δ. 


The particles E and F fall with almost the same acceleration as C, along 
the radial lines EO and FO joining them to the centre O of the Earth. However, 
these lines come closer together as the mass approaches the Earth. Therefore, the 
diameter EF contracts slowly from r to r − ε. 

The net result is that a massive spherical ball which had been dropped 
from a height h above the surface of the Earth becomes an ellipsoid. An 
observer who sits inside a “rigid box” (which actually gets deformed by the 
tidal forces) shown by a rectangular frame, finds deformation of the 
originally spherical mass into an ellipsoid as the box falls from P to Q. 

If the mass is an entirely incoherent assembly of particles, falling toward 
a gravitating centre along an ellipse or a circle, its different parts will 
accelerate and fall in their own ways. As a consequence they will disperse, 
fall apart, and the assembly will hardly look like one body after some time. 
However, extended objects, like comets, are not entirely incoherent, 
because the internal gravitational pull among different parts acts like a 
bond. But they become distorted. 


Fig. 1.2. Tidal deformation of a spherical object in free fall. 


The same distortion will occur if the massive spherical ball had been a 
satellite of the earth, orbiting the earth in a circular orbit, as shown in Fig. 
1.2(b). The mathematical analysis of this effect is not as simple as in the 
case of vertical free fall, but not so complicated either.ᵉ 

This distortion, due to differential accelerations of different parts of an 
extended body, is what we call tide. The regions of elongation, around A 
and B are the locations of high tide. The regions of contraction around E 
and F are the locations of low tide. The real gravity is characterized by tidal 
forces. 
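
The differential accelerations described above can be estimated with Newtonian gravity. The sketch below is a hedged illustration, not the analysis cited in the footnotes: it assumes a point-mass Earth, rounded textbook constants, and hypothetical function names. For a ball whose centre C is at distance R from the Earth's centre, the particle A at distance R − r is pulled ahead of C by roughly 2GMr/R³ (stretching along AB), while E and F, falling along converging radial lines, approach C with a relative acceleration of roughly GMr/R³ (squeezing along EF).

```python
# Newtonian estimate of tidal (differential) accelerations in free fall.
# Constants are rounded textbook values; the Earth is treated as a point mass.
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
M_EARTH = 5.972e24   # mass of the Earth, kg
R_EARTH = 6.371e6    # mean radius of the Earth, m


def g_at(distance_m):
    """Gravitational acceleration at a given distance from the Earth's centre."""
    return G * M_EARTH / distance_m**2


def tidal_accelerations(centre_distance_m, r_m):
    """Return (exact stretch, approximate stretch, approximate squeeze).

    stretch: acceleration of A relative to C along the radial diameter AB
    squeeze: acceleration of E (or F) towards C along the transverse diameter EF
    """
    stretch_exact = g_at(centre_distance_m - r_m) - g_at(centre_distance_m)
    stretch_approx = 2.0 * G * M_EARTH * r_m / centre_distance_m**3
    squeeze_approx = G * M_EARTH * r_m / centre_distance_m**3
    return stretch_exact, stretch_approx, squeeze_approx


# Hypothetical example: a ball of radius 10 m whose centre is 1000 km above the surface.
exact, stretch, squeeze = tidal_accelerations(R_EARTH + 1.0e6, 10.0)
print(f"stretch (exact)  = {exact:.4e} m/s^2")
print(f"stretch (approx) = {stretch:.4e} m/s^2")
print(f"squeeze (approx) = {squeeze:.4e} m/s^2")
```

The stretching is twice as strong as the squeezing, which matches the picture of the sphere turning into an ellipsoid elongated along AB.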


1.7. The Scheme of General Relativity 


Acceleration-induced (spurious) gravity can be transformed away 
everywhere by suitably selecting a “global” inertial frame. “Real” gravity 
can be transformed away locally, but not globally. There does not exist a 
global inertial frame in the presence of a gravitating body like the sun or the 
earth. Perhaps the following examples will clarify. 

Imagine a frame of reference S freely falling under the gravitational pull 
of the earth (and at the same time going in a circular orbit) as shown in Fig. 
1.3. From this frame, observe the motion of two balls A and B, both freely 
falling. Of the two, A is very near the origin of S, say within a “small” 
radius R (which has to be defined properly), whereas B is far away. In the 
frame S, A will be either stationary, or moving along a straight line with 
uniform velocity (at least for a limited span of time, depending on the initial 
velocity and initial location of A), whereas B will be moving with a non-uniform 
speed. This means that the law of inertia is seen to be valid only over 
a limited region around the origin of S, and that this region, i.e. the radius R, 
shrinks as time passes. The frame S is only locally 
inertial, locally with respect to both space and time. 




Fig. 1.3. Local inertial frame S. 


A local inertial frame is analogous to Cartesian axes on the surface of a 
sphere. If the sphere is sufficiently large, like the earth, any city or town can 
be considered to be built on a surface which is flat locally, so that the map 
of the city can be drawn on a flat sheet of paper with Cartesian axes running 
in the W-E and S-N directions. On the other hand, it will not be possible to 
draw the map of the entire Asian continent, for instance, on a flat sheet of 
paper. 

The surface of the earth is locally flat, so that we can draw straight lines 
locally. But if we stretch two parallel straight lines too far, they will 
ultimately cross. In the same way, if we consider two balls, originally 
floating stationary near each other inside the “Einstein’s elevator” that is 
falling freely vertically downward, they will come closer to each other with 
the passage of time and will actually meet each other if the elevator is 
allowed to fall all the way to the centre of the earth. 

Einstein conjectured that if we consider the world line of an object 
moving in a gravitational field, that world line will be the straightest possible 
path in a curved four-dimensional space-time. (World lines are the trajectories 
of particles and photons in a four-dimensional world, called space-time, in 
which time is also a coordinate axis; they are discussed in Sec. 7.1.) 

The straightest possible lines on a curved surface are called geodesics. 
Geodesics on the surface of our globe are also called great circles (e.g. the 
equator, the meridian circles). By analogy, Einstein proposed the famous 
geodesic hypothesis, according to which all freely falling objects, like the 
planets, satellites, moving in the gravitational fields of the sun or the earth, 
trace out geodesics (i.e. straightest lines) in a four-dimensional space-time. 
The path of a starlight progressing along a geodesic bends as it grazes the 
sun’s periphery, the effect discussed in Sec. 1.5. This bending of “straight 
lines” is the manifestation of curvature in the space-time. 

Coming back to the example of vertical free fall of the particles E and F 
shown in Fig. 1.2(a), the geodesic lines of these two particles come closer 
to each other. This reminds us of the geodesic lines drawn on the surface of 
the earth meeting at some point. The geodesic lines that cross the equator 
perpendicularly converge at the North and South poles. This happens 
because the surface of the earth is curved. 

In the same way, the four-dimensional space-time is curved. There is a 
relative acceleration between two objects both of which are falling freely 
(the particles E and F in the above example). This relative acceleration, 
when seen in the four-dimensional space-time, constitutes the curvature of 
the space-time. “Gravitation is a manifestation of space-time curvature, 
and that curvature shows up in the deviation of one geodesic from a nearby 
geodesic (relative acceleration of test particles).”ᶠ 

Einstein constructed the Curvature Tensor, reshaped it through 
mathematical steps and identities into what is known as the Einstein tensor 
$E^{\mu\nu}$, and wrote the Field Equation of Gravitation in the esoteric form 
known as the Einstein Equation: 


$$E^{\mu\nu} = \frac{8\pi G}{c^4}\, T^{\mu\nu}, \tag{1.1}$$


in which the source term $T^{\mu\nu}$ on the right-hand side is the energy tensor. 
This term is a generalization of the mass density ρ used in the Newtonian field 
equation of gravitation, written in the form of Poisson’s equation 


$$\nabla^2 \Phi = 4\pi G \rho. \tag{1.2}$$


Note that the energy tensor $T^{\mu\nu}$ replaces the mass density ρ, because of 
mass-energy equivalence, and the Einstein tensor $E^{\mu\nu}$ replaces ∇²Φ, where Φ is 
the gravitational potential. The gravitational constant G = 6.67 × 10⁻¹¹ 
m³/(kg s²) is the common factor in both equations. 

We have discussed tensors and the energy tensor in detail in this book, but 
stop short of the grand finale of celebrating Einstein’s equation, because 
we are not yet prepared for that great journey. 

How far are the predictions of Einstein’s theory from those of Newton? 

In the non-relativistic limit, i.e. when the source of the gravitational 
field is pure stationary matter, Einstein equation (1.1) converges to the 
Newtonian equation (1.2). 

In Newtonian theory gravity is an action-at-a-distance force generated 
by massive objects, e.g. the sun. The path of a test particle is determined by 
solving Newton’s second law of motion, which is a second-order 
differential equation. The second integral of this Equation of Motion (EoM) 
predicts an elliptic path that a planet must follow under the gravitational 
force of the sun. No such force acts on a massless photon, which must 
follow a straight path. 

In Einstein’s theory, all freely falling particles (i.e. particles falling 
under gravity) including a photon, follow straightest lines, or geodesics in 
the four-dimensional space-time. The path of a planet in the gravitational 
field of a star, e.g. the sun, when projected on the three-dimensional 
physical space, shows up as an ellipse, as in the Newtonian case. However, 
this ellipse has a slow precession rate, i.e. its major axis turns slowly in its 
plane about the sun. 




Fig. 1.4. The world line of a planet in the gravitational field of the sun (a) Kepler’s orbit, (b) 
Schwarzschild orbit. 


We have shown the precession of a hypothetical planet and its world 
line in Fig. 1.4. The effects shown in the diagrams are highly exaggerated to 
make an impression on the reader. The elliptic orbit shown in the diagram 
has an eccentricity of 0.7, whereas the orbits of the planets are far less 
eccentric (about 0.093 for Mars and about 0.21 for Mercury, the largest 
value among the planets), so that all planets move in what appear to be nearly 
circular orbits. The major axis of the orbit in the diagram shows an angular 
displacement of almost 45° in one revolution (i.e. in one planet year), 
compared to about 43″ per century in the case of Mercury (which has the 
maximum precession rate). 
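
The figure of about 43″ per century can be checked against the standard general-relativistic expression for the perihelion advance per revolution, Δφ = 6πGM/[c²a(1 − e²)], where a is the semi-major axis and e the eccentricity of the orbit. This formula is not derived in this chapter; it is quoted here as an assumption, and the orbital elements of Mercury used below are rounded reference values. The function names are hypothetical.

```python
import math

# Relativistic perihelion advance, using the standard result
#   delta_phi = 6 * pi * G * M / (c^2 * a * (1 - e^2))   per revolution.
# The formula is quoted, not derived here; orbital data are rounded values.
GM_SUN = 1.32712440018e20   # G times the solar mass, m^3/s^2
C = 299_792_458.0           # speed of light, m/s


def precession_per_orbit(a_m, e):
    """Perihelion advance in radians per revolution."""
    return 6.0 * math.pi * GM_SUN / (C**2 * a_m * (1.0 - e**2))


def arcsec_per_century(a_m, e, period_days):
    """Perihelion advance accumulated over a Julian century, in arcseconds."""
    orbits = 36525.0 / period_days
    return math.degrees(precession_per_orbit(a_m, e) * orbits) * 3600.0


# Mercury: a = 5.7909e10 m, e = 0.2056, period = 87.969 days (rounded values)
mercury = arcsec_per_century(5.7909e10, 0.2056, 87.969)
print(f"Mercury: {mercury:.1f} arcsec per century")
```

Replacing the orbital elements with those of another planet shows how quickly the effect diminishes with distance from the sun.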

Keeping these exaggerations in mind let us look at Fig. 1.4. In part (a), 
we have shown the Newtonian orbit, labelled as Kepler orbit, because 
Johannes Kepler had discovered the elliptic orbits of planets through 
detailed observations of their positions in the sky long before Newton. In 
part (b), we have labelled the precessing orbit as Schwarzschild orbit, 
because Schwarzschild solved Einstein’s equation for a spherically 
symmetric source, and obtained the geodesic of a planet under the new 
theory of gravitation. In parts (c) and (d), we have drawn the world lines of 
the planet moving along the said orbits. 

Without much ado let us now summarize in the following words. The 
nature of the curvature of the four-dimensional space-time is governed by 
the distribution of energy and momentum (in the case of the earth and the 
stars, the matter itself acting as energy). Einstein’s field equation, which 
establishes the relation between the distribution of energy-momentum and 
curvature, is actually equivalent to Newton’s formula of universal 
gravitation when the bodies producing curvature of space-time (like the 
earth, the sun) are stationary. Therefore, Einstein’s General Relativity 
theory does not invalidate Newton’s theory. It provides an alternative 
approach to gravitation through the path of geometrodynamics, an approach 
in which gravity is just a manifestation of the intrinsic geometry of 
space-time. Even though this approach has nothing in common with Newton’s 
universal law of gravitation, its ultimate predictions match those of Newton 
in a non-relativistic situation. However, Einstein’s theory leads to bending 
of light, black holes and many other phenomena that Newton’s theory 
cannot foresee. 


1.8. Conclusion 


Even though the last few sections digressed into a domain that has no direct 
application in this book, they may have placed the reader in a better 
perspective. Special Relativity hinges on the concept of inertial frames. The 
reader may have appreciated that this inertial frame is just an idealization, 
which does not exist globally. If anything, that was one of the lessons 
conveyed by the last section. There is another justification. The fundamental 
creed of relativity is equivalence of frames of reference. Such equivalence 
beckons like a mirage as long as one is confined within the bounds of 
special relativity. To explain what equivalence truly means, one has to 
consider equivalence in its entirety, i.e. one has also to consider accelerating 
frames. However, the moment one considers accelerating frames, gravity 
enters by the back door. Thus, gravity and equivalence of frames are 
inseparable. 

The mention of relativity conjures up visions of a paradoxical world 
where space odyssey rejuvenates the youth, speeding clocks tick slowly and 
matter is transmuted into energy. How much of these mind-boggling stories 
is truth and how much is fiction? 

In the following chapters, we shall follow the relativity postulate to a 
logical end to seek answers to some of these questions. We shall discover 
that this innocent-looking postulate (equivalence of frames of reference) 
contains seeds of epoch-making consequences. One may not miss its 
philosophical message: that an abiding faith in certain values, when 
carried through trials and tribulations to its logical end, can lead to a world 
of miraculous discoveries. 

We quote the following lines from Professor S. Chandrasekhar [10]: “It 
is an incredible fact that what the human mind, at its deepest and most 
profound, perceives as beautiful, finds its realization in external nature.” To 
which we add: What the human mind perceives as just and equitable finds 
its realization in the man-made structure of the physical laws. 


a Fix three gyroscopes with their axes perpendicular to each other. If the axes remain fixed in 
direction, then the frame of reference is non-rotating. 


b See Ref. [2]. 


c The Equivalence Principle, in its “weak form” and “strong form”, needs to be understood for a study of 
Relativity, in particular the General Theory of Relativity. Read what the masters have written on this 
[3–5]. 


d The phenomenon of tide as the signature of real gravity is discussed by Taylor and Wheeler [6]. 


e See, for instance, [7]. Or, get the mathematical analysis with diagrams in [8]. For a layman’s view, 
see [9]. 


f See [5, pp. 17-18 in Chapter 1, pp. 218-219 in Chapter 8, pp. 265-271 in Chapter 11]. 


Chapter 2 


Einstein’s Postulates, Their 
Paradoxes, and How to Resolve 
Them 


2.1. Event Point in Space-Time 


Let us start with the following example. A rocket which was fired from the 
ground exploded in the atmosphere. This explosion is an event. Events such 
as this, and more varied than this, play a central role in the concept structure 
of relativity. We shall use double quotes “ … ” to indicate an 
event. For example, we shall say that “the rocket exploded in the 
atmosphere” is an event, to be denoted by a symbol, say, “©”. 

An observer S in Delhi can pin-point the location and timing of the 
event “©” by stating that it occurred 200 km west, 250 km north, at a height 
of 60 km and at exactly 20 hours IST. To convey this information 
compactly, we could imagine a set of X-, Y-, Z-axes whose origin is in 
Delhi, such that the X-axis is directed eastward, the Y-axis northward and 
the Z-axis vertically upward. We shall designate this set of axes, or, the 
reference frame defined by this set of axes, by the symbol S. We shall often 
use the same symbol to mean either the observer or his frame of reference. 
With respect to S, the event “©” is completely identified by specifying the 
four numbers −200, 250, 60, 20 in this particular order. (The first 
number, −200, means that the x coordinate of the event is 200 km in the 
negative X-direction.) These four numbers, when arranged in this particular 
order, are called the coordinates of “©” with respect to S. We write “©” = 
(−200, 250, 60, 20) with respect to S. 


Fig. 2.1 Coordinates of a particle with respect to frames S and S’. 


Conversely, every set of four numbers, arranged in the order (x, y, z, t), 
will be considered to constitute an “event”. (There is one anomaly in the 
above coordinates, namely x, y, z have the dimension of length, whereas t 
has the dimension of time. This anomaly will be removed in Sec. 3.1 by 
multiplying t with the velocity of light, so that all the four coordinates of an 
event will have the dimension of length.) 

Now consider two frames of reference S and S’, of which S is the 
Absolute Frame of reference conceived by Newton in which his laws of 
motion were assumed to be valid, and S' is another frame of reference (the 
earth for example) which is moving in the X-direction with velocity u, as 
shown in Fig. 2.1. 

Note that we have tagged the frames by means of flags, a convention we 
shall follow in the rest of this book. We imagine a set of three mutually 
perpendicular rigid bars serving as the axes of each frame of reference: XYZ 
for S and X'Y'Z' for S'. We have taken the X-axis of S and the X'-axis of S' to be 
parallel and directed along the velocity u of the earth. Written component-wise, 
u = (u, 0, 0) with respect to S. 

Now imagine a particle P of mass m moving in space. Let A be a point 
on the trajectory of the particle. “The particle arrives at A” is an event. We 
call it the event “©_A”. The coordinates of this event are as follows: 


$$
\text{“©}_A\text{”} =
\begin{cases}
(x, y, z, t) & \text{in } S, \\
(x', y', z', t') & \text{in } S'.
\end{cases}
$$


For the configuration shown in Fig. 2.1, the coordinates of the origin O' of 
S', with respect to S, are R₀ = (x₀, y₀, z₀) at t = 0, and R = (x₀ + ut, y₀, z₀) = 
R₀ + ut at time t. It is obvious that 


$$
\begin{aligned}
x' &= x - x_0 - ut, \\
y' &= y - y_0, \\
z' &= z - z_0, \\
t' &= t.
\end{aligned}
\tag{2.1}
$$


Written compactly, 


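$$
\begin{aligned}
\mathbf{r}' &= \mathbf{r} - \mathbf{R} = \mathbf{r} - \mathbf{R}_0 - \mathbf{u}t, \\
t' &= t.
\end{aligned}
\tag{2.2}
$$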


Equation (2.1) and its equivalent form (2.2) constitute the familiar 
Galilean Transformation (GT) with the inclusion of the time coordinate. 
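
The Galilean transformation of an event can be written as a small function. The sketch below is only an illustration: the function name and the sample numbers are hypothetical, and any consistent set of units (for instance km and hours) may be used.

```python
# Galilean transformation of an event from frame S to frame S'.
# R0 is the position of the origin O' of S' at t = 0, and u its velocity,
# both measured in S. The function name and sample values are illustrative.
def galilean_transform(event, R0, u):
    """Map the coordinates (x, y, z, t) in S to (x', y', z', t') in S'."""
    x, y, z, t = event
    x0, y0, z0 = R0
    ux, uy, uz = u
    return (x - x0 - ux * t,   # x' = x - x0 - u_x t
            y - y0 - uy * t,   # y' = y - y0 - u_y t
            z - z0 - uz * t,   # z' = z - z0 - u_z t
            t)                 # t' = t (time is absolute in the GT)


# Hypothetical example with S' moving along the X-axis, as in Fig. 2.1.
event_S = (100.0, 20.0, 5.0, 2.0)
event_S_prime = galilean_transform(event_S, R0=(10.0, 0.0, 0.0), u=(30.0, 0.0, 0.0))
print(event_S_prime)
```

With the velocity directed along the X-axis, only the x coordinate of the event changes, while y, z and t are at most shifted by constants.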


2.2. Inertial Frames 


2.2.1. Newton’s equation of motion 


Much of physics is based on mechanics and it deals with motion of objects. 
The physical quantities associated with motion are displacement, velocity, 
acceleration, etc. These quantities must be specified with respect to some 
reference frame. When we say that the velocity of a steamer is 25 km/hr, this 
implies that the steamer recedes from a point fixed on the bank of the river 
at the rate of 25 km/hr. Here, the river bank constitutes a reference frame. 
We can imagine a boat on the river sailing in the direction of the steamer at 
a speed of 10 km/hr. This boat constitutes another frame. The velocity of 
the steamer with respect to this second frame will be 15 km/hr. Therefore, it 
is meaningless to talk about the laws of motion without first fixing in 
our mind a particular reference frame. What reference frame, then, did 
Newton have in mind when he enunciated the laws of motion? 

Newton presumed the existence of some Absolute Frame, as mentioned 
in Sec. 1.2. He identified it with the frame of the ‘fixed stars’. In our 
discussion, we shall tentatively identify this so-called Absolute Frame (AF) 
as one fixed with respect to the sun. Now let us state the first two laws of 
motion, which, by assumption, are valid in this AF. The first law, also called 
the law of inertia, states that the acceleration a of a particle is zero in the 
absence of an external force. 

The second law states that a is not zero if the particle is acted on by an 
external force F, in which case F equals mass times the acceleration of the 
particle: 


F = ma. (2.3) 


Another Newtonian assumption is that the mass m of the particle is 
absolute. It does not change with the velocity of the particle and it is the 
same to all observers. 

Should the above two laws be valid on the earth, which is moving 
with respect to the sun? 

Strictly speaking, the answer is ‘no’. Not because the earth is just 
moving, but because the earth is rotating about its own axis and is also 
going around the sun in a circular motion. However, for the time being let 
us ignore the spinning motion and the orbital motion around the sun, the 
effects of which are relatively small. Let us assume, for simplicity, that the 
earth is moving, without spinning, along a straight line with a uniform 
speed of 30 km/s with respect to the sun. 

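Let the velocity of the particle P be v with respect to S, and v' with 
respect to E. Component-wise, 


$$
\begin{aligned}
\mathbf{v} &= \frac{d\mathbf{r}}{dt} = \left(\frac{dx}{dt}, \frac{dy}{dt}, \frac{dz}{dt}\right) && \text{with respect to } S, \\
\mathbf{v}' &= \frac{d\mathbf{r}'}{dt'} = \left(\frac{dx'}{dt'}, \frac{dy'}{dt'}, \frac{dz'}{dt'}\right) && \text{with respect to } E.
\end{aligned}
\tag{2.4}
$$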


Note from Eq. (2.1) that dt' = dt, dx' = dx − u dt, dy' = dy, dz' = dz. 
Therefore, 

$$
\mathbf{v}' = \left(\frac{dx - u\,dt}{dt}, \frac{dy}{dt}, \frac{dz}{dt}\right),
\quad\text{or}\quad
\mathbf{v}' = \mathbf{v} - \mathbf{u},
\tag{2.5}
$$


which is the transformation equation for velocity. 


Let a and a’ denote the acceleration of P with respect to S and E, 
respectively. Then 


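$$
\mathbf{a} = \frac{d\mathbf{v}}{dt}, \qquad
\mathbf{a}' = \frac{d\mathbf{v}'}{dt'} = \frac{d(\mathbf{v} - \mathbf{u})}{dt} = \frac{d\mathbf{v}}{dt} = \mathbf{a},
\tag{2.6}
$$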


since u is a constant vector. We notice that even though velocity transforms 
under the GT, acceleration remains invariant. 

By our assumption, Newton’s second law of motion, as given by Eq. 
(2.3) is exactly valid in the Absolute Frame S. Also, the measure of the 
external force, as for example determined by the reading on a spring 
balance, should be the same in E as in S. Hence, from Eq. (2.6) 


F = ma’. (2.7) 


Thus, Newton’s second law of motion is valid in E, if it is valid in S. 

As a special case let F = 0. This would imply a = 0, as well as a’ = 0. 
This is Newton’s first law of motion, which is therefore valid in E, if it is 
valid in S. 

In summary, if the E frame is imagined to be a non-rotating frame 
moving with uniform velocity with respect to the S frame, then Newton’s 
first and second laws of motion — which are postulated to hold in S — also 
hold in E. These laws are valid in any frame S' with similar properties (i.e. 
non-rotating and moving with uniform velocity with respect to the AF). By 
induction, since these laws are valid in S', they are valid in any other non- 
rotating frame S” moving with uniform velocity with respect to S’. By this 
process, we obtain an infinity of frames of reference which are non-rotating 
and moving with uniform velocities with respect to one another, and with 
respect to the AF — and Newton’s laws of motion are valid in them all. 

A frame of reference in which the law of inertia (i.e. Newton’s first law 
of motion) holds is called an inertial frame (IF) — as we have discussed at 
some length in the previous chapter. We recognize that there is an infinite 
number of IFs. Let S be any one IF (not necessarily the AF) and let S' be 
another. If, at t = 0, the origin of S' is at the coordinates r₀ = (x₀, y₀, z₀) and 
moving with velocity u with respect to S, then the Galilean transformation 
from S to S' gives us the following relations: 


$$
\begin{aligned}
\mathbf{r}' &= \mathbf{r} - \mathbf{r}_0 - \mathbf{u}t, \\
t' &= t, \\
\mathbf{v}' &= \mathbf{v} - \mathbf{u}, \\
\mathbf{a}' &= \mathbf{a}.
\end{aligned}
\tag{2.8}
$$


All IFs will measure the same acceleration of a moving object. 
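
That velocity changes under the GT while acceleration does not can also be verified numerically. The sketch below is a hedged illustration: it assumes a hypothetical one-dimensional motion x = 5 + 3t + 2t² in S, a frame S' with x₀ = 1 and u = 10, and estimates velocity and acceleration by finite differences. None of these numbers come from the text.

```python
# Numerical check that velocity transforms as v' = v - u under the Galilean
# transformation while acceleration stays the same. The trajectory and the
# frame parameters X0, U are hypothetical illustration values.
X0 = 1.0    # position of the origin of S' at t = 0, measured in S
U = 10.0    # velocity of S' relative to S


def position_S(t):
    """Position of the particle in S: constant acceleration of 4 units."""
    return 5.0 + 3.0 * t + 2.0 * t**2


def position_S_prime(t):
    """Position of the same particle in S': x' = x - x0 - u t."""
    return position_S(t) - X0 - U * t


def velocity(pos, t, h=1e-3):
    """Central-difference estimate of dx/dt."""
    return (pos(t + h) - pos(t - h)) / (2.0 * h)


def acceleration(pos, t, h=1e-3):
    """Central-difference estimate of d^2x/dt^2."""
    return (pos(t + h) - 2.0 * pos(t) + pos(t - h)) / h**2


t = 2.0
print("v  =", velocity(position_S, t), " v' =", velocity(position_S_prime, t))
print("a  =", acceleration(position_S, t), " a' =", acceleration(position_S_prime, t))
```

The two velocities differ by u, whereas the two accelerations agree to within the rounding error of the finite differences.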

In the above we have considered only Newton’s first and second laws of 
motion. What about the third law? The third law of motion is a corollary of 
the law of conservation of momentum. We shall now discuss the invariance 
of the laws of conservation of momentum and energy under GT. 


2.2.2. Conservation of energy and momentum 


Conservation of energy and momentum are among the foundational 
principles of physics. We shall show that these laws are valid in all inertial 
frames by examining a two-body collision. 

Figure 2.2 shows a view of two particles A and B engaging in a 
collision, as viewed from some inertial frame S. The collision — a term 
which is often used to mean a passing interaction between two particles — 
may result in the creation of new particles after the original ones have 
encountered each other. Therefore, for the sake of generality, we are 
considering two different particles C and D emerging from the scene of 
collision. 

In Newtonian physics mass is conserved, so that 


$$m_A + m_B = m_C + m_D. \tag{2.9}$$


Let us postulate that the total momentum and the total kinetic energy of an 
isolated system (i.e. a system which is not influenced by anything from 
outside) be each conserved in an elastic collision, when viewed from an 
inertial frame S. This gives the following two relations: 


(i) Conservation of Momentum: 


$$m_A \mathbf{v}_A + m_B \mathbf{v}_B = m_C \mathbf{v}_C + m_D \mathbf{v}_D. \tag{2.10}$$


(ii) Conservation of kinetic energy: 

$$\tfrac{1}{2} m_A v_A^2 + \tfrac{1}{2} m_B v_B^2 = \tfrac{1}{2} m_C v_C^2 + \tfrac{1}{2} m_D v_D^2. \tag{2.11}$$

We shall now use Eq. (2.8) to rewrite the above equations in terms of 
velocities, as measured in the frame S'. Letting v'_A, v'_B, v'_C, v'_D be the velocities 
of the respective particles in the frame S', we have 


Fig. 2.2 Collision of two particles. 
$$m_A(\mathbf{v}'_A + \mathbf{u}) + m_B(\mathbf{v}'_B + \mathbf{u}) = m_C(\mathbf{v}'_C + \mathbf{u}) + m_D(\mathbf{v}'_D + \mathbf{u}).$$


Using Eq. (2.9), the above equation reduces to the following form: 

$$m_A \mathbf{v}'_A + m_B \mathbf{v}'_B = m_C \mathbf{v}'_C + m_D \mathbf{v}'_D. \tag{2.12}$$

This shows that the total momentum, as measured in the frame S’, is the 
same after a collision, as it is before the collision. We now convert Eq. 
(2.11) along parallel lines: 
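
$$\tfrac{1}{2} m_A (\mathbf{v}'_A + \mathbf{u})^2 + \tfrac{1}{2} m_B (\mathbf{v}'_B + \mathbf{u})^2 = \tfrac{1}{2} m_C (\mathbf{v}'_C + \mathbf{u})^2 + \tfrac{1}{2} m_D (\mathbf{v}'_D + \mathbf{u})^2.$$

Using (2.9) and (2.12) the above equation reduces to the desired form: 


$$\tfrac{1}{2} m_A {v'_A}^2 + \tfrac{1}{2} m_B {v'_B}^2 = \tfrac{1}{2} m_C {v'_C}^2 + \tfrac{1}{2} m_D {v'_D}^2. \tag{2.13}$$

Equation (2.13) now validates conservation of the kinetic energy in the 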
frame S’. 

In summary, if the laws of conservation of linear momentum and kinetic 
energy are established in one inertial frame, then they are established in all 
inertial frames, as a consequence of the Galilean transformation. We can 
extend the same principle to other physical quantities, like angular 
momentum and total energy (i.e. the sum of the kinetic energy and the 
potential energy) and establish that they are conserved in all inertial frames. 
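
The frame independence of the two conservation laws can be illustrated with a simple numerical experiment. The sketch below makes several simplifying assumptions that are not in the text: the collision is one-dimensional and elastic, the outgoing particles C and D are the same as the incoming A and B, and the masses and velocities are hypothetical values. All function names are illustrative.

```python
import math

# Check that momentum and kinetic energy, conserved in frame S, are also
# conserved in any frame S' moving with velocity u relative to S.
# Assumptions: 1-D elastic collision, outgoing particles identical to the
# incoming ones, hypothetical masses and velocities.


def elastic_collision_1d(m_a, v_a, m_b, v_b):
    """Final velocities of A and B after a one-dimensional elastic collision."""
    total = m_a + m_b
    v_a_after = ((m_a - m_b) * v_a + 2.0 * m_b * v_b) / total
    v_b_after = ((m_b - m_a) * v_b + 2.0 * m_a * v_a) / total
    return v_a_after, v_b_after


def momentum(masses, velocities):
    return sum(m * v for m, v in zip(masses, velocities))


def kinetic_energy(masses, velocities):
    return sum(0.5 * m * v**2 for m, v in zip(masses, velocities))


def conserved_in_frame(masses, before, after, u):
    """True if momentum and kinetic energy are conserved as seen from S'."""
    before_p = [v - u for v in before]   # v' = v - u for each particle
    after_p = [v - u for v in after]
    same_p = math.isclose(momentum(masses, before_p),
                          momentum(masses, after_p), abs_tol=1e-9)
    same_ke = math.isclose(kinetic_energy(masses, before_p),
                           kinetic_energy(masses, after_p), abs_tol=1e-9)
    return same_p and same_ke


masses = (2.0, 1.0)
before = (3.0, -1.0)
after = elastic_collision_1d(2.0, 3.0, 1.0, -1.0)
for u in (0.0, 7.0, -30.0):
    print(f"u = {u:6.1f}: conserved = {conserved_in_frame(masses, before, after, u)}")
```

The individual values of momentum and kinetic energy differ from frame to frame; it is only their equality before and after the collision that survives the transformation.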


2.2.3. Equivalence of inertial frames 


The conclusions just reached in the previous sections lead to equal status of 
all inertial frames — at least within the limited scope in which we have 
examined them. They clearly tell us that all inertial frames are equivalent 
with respect to the experiments and laws of mechanics. A coin tossed inside 
a smoothly cruising jet plane will fall exactly in the same way as it will fall 
on the ground. Considered from a more general perspective, every IF has all 
the properties that characterize the hypothetical AF. 

This is the Newtonian principle of relativity. In essence it says that 
identical mechanical experiments performed in different inertial frames will 
yield identical results. The laws of conservation of mechanical momentum, 
mechanical energy and the laws of motion which are presumed to be valid 
in the AF are seen to be valid in all IFs, as a consequence of the GT. 
Therefore, there is no experiment, at least in the domain of mechanics, by 
which this hypothetical AF, even if it exists, can be identified. 


2.3. Historical Background 


2.3.1. Search for the absolute frame 


If it is impossible to tell, by any means whatsoever, which frame is AF and 
which frame is not, then why should there be any reason for hypothesizing 
an AF at all? 

With the formulation of the laws of electricity and magnetism, however, 
the need for this AF became evident. James Clerk Maxwell rationalized the 
phenomena of electricity and magnetism into a set of equations known as 
Maxwell’s equations. These equations lead to wave equations for the 
electric and magnetic fields, showing a characteristic wave speed c which, 
in vacuum, equals 3 × 10⁸ m/s, the same as the speed of light. This means 
that if you change the charge-current configuration somewhere in space, 
the electric and magnetic fields will change everywhere in space, but the 
field will change earlier in the near region and later in the far region. The 
messenger that carries the command for change from a near region to a far 
region is the electromagnetic wave and it propagates with the speed of light, 
somewhat in the same manner that ripples propagate the information of a 
disturbance on the surface of a pond with a much slower velocity. Radio 
waves, visible light, X-rays are all such electromagnetic waves, lying in 
different band zones of the frequency spectrum. 

Before the formulation of the theory of relativity it was generally 
believed by physicists that electromagnetic waves were similar to 
mechanical waves — like sound, seismic waves, ripples on the surface of a 
pond. Each of these examples is associated with a medium that carries the 
wave. A disturbance in the mechanical configuration occurs somewhere in 
the medium, and this information is sent outwards by the elastic properties 
of the medium in the form of a wave. Water surface is the medium for 
ripples, air for sound, earth for seismic waves. 

It was generally believed that the “mechanical medium” that carries the 
disturbance called electromagnetic wave, or light, is aether. Many 
physicists, including Maxwell himself, dabbled with the hypothetical 
properties of aether. Aether was thought to be a fluid that pervaded all 
space, penetrated all materials, and had some extraordinary properties, like 
perfect elasticity (so that no energy is extracted out of light when it 
propagates through it) and extremely high modulus of rigidity (so that light 
waves, oscillating at very high frequencies, of the order of 10¹⁶ Hz, could 
propagate through it). 


One now gets a clue for identifying the AF. There must be some frame 
of reference, say S₀, in which Maxwell’s equations are valid exactly. 
However, if they are valid exactly in S₀, then they cannot be exactly valid in 
some other frame S, because any GT applied to these equations will destroy 
their forms (equations of electrodynamics involve first-order derivatives 
whereas those of Newton involve second-order derivatives like d²/dt²). 
Therefore, people were inclined to believe that not all inertial frames were 
equal, that among all the IFs there did exist one privileged frame, which 
alone was entitled to claim the equations of electrodynamics, and had, 
therefore, an absolute character. That frame of reference must be the 
long-cherished AF. 

It was therefore speculated that aether was at rest in the AF, and light, 
which propagates in aether, must have an absolute speed c in all directions 
with respect to this aether. 

How to identify this AF? The answer should not be difficult even for a 
lay reader. In any other IF, which is moving, say in the X-direction with 
velocity u with respect to the AF, the speed of light in the +X-direction will 
be c − u, and in the −X-direction will be c + u. The speed of light should 
therefore be different in different directions in this new IF. Among all the 
IFs there is one, and only one, IF in which the speed of light is the same in all 
directions, and that frame alone is to be identified as the AF. 

An experiment can be devised to measure the difference in the 
velocities of light in two different directions on the surface of the earth. 
This will give us immediate information about the velocity of the earth 
relative to the AF. A large number of ingenious experiments, of which the
Michelson-Morley experiment is the best known, have been
performed to measure this difference. Contrary to everybody’s expectations,
no difference has ever been found. All experiments on the velocity of light
have unmistakably shown that light propagates with the same constant 
speed c in all directions in vacuum, at all times and in all seasons, in the 
frame of reference of the earth. 

The Michelson-Morley experiment attempts to determine the velocity
of the earth relative to aether. Alternatively, since aether is assumed to be at 
rest in the Absolute Frame, the outcome of the experiment should determine 
the velocity v of the earth relative to the AF. It was found from this 
experiment that v is zero. 


In summary, it will be sufficient to say that the theory of relativity is 
founded on the premise that there is no aether and no Absolute Frame. 

We shall now give a brief account of the Michelson-Morley
experiment. 


2.3.2. Michelson-Morley experiment 


The Michelson-Morley experiment (to be abbreviated as MM experiment
in the following) utilizes the Michelson interferometer. Figure 2.3(a) describes
the basic set-up of the apparatus. Here S is a source of monochromatic
light. A collimator takes a parallel beam out of this source. This beam is
split into two components ① and ② by the partially silvered mirror A. The
component ①, which is transmitted through A towards B, gets reflected
back at the mirror B, and then comes back to A. The other component ② is
reflected upwards at A, goes to the mirror C, and is reflected back to A.
These two beams, after returning to A, get partially reflected again and
partially transmitted again at A, so that a fraction of each one of them, say
half of ① and half of ②, will now recombine and proceed along the path
AD as a single beam. This recombination of two fractions of what used to
be a single beam earlier, after they have travelled through two different path
lengths, causes what is known as “optical interference”. The lens will focus
these interfering beams onto a screen.

Fig. 2.3. Schematic arrangement of Michelson-Morley experiment.


Suppose the earth is moving, with respect to aether, with velocity v in
the direction of the line AB, which we take as the X-axis. This means that
aether is moving in the negative X-direction with speed v with respect to the
earth. Since light travels in aether with speed c, the speed of light in the Lab
frame as it travels along the +X-axis will be c − v, and along the −X-axis
will be c + v. If $T_1$ is the time required for the beam ① to go from A to B
and then travel back from B to A, and L is the common length of the arms
AB and AC, then

$$T_1 = \frac{L}{c - v} + \frac{L}{c + v} = \frac{2L}{c}\,\frac{1}{1 - v^2/c^2}.$$

Now consider the second beam ② following the path ACA, which it covers
in time $T_2$. To an observer at rest in aether, this beam must follow the
slanted path AC′A″, shown in Fig. 2.3(b), in order for it to reach the mirror
A which has moved to the point A″ during the same time $T_2$. If v is the
velocity of aether with reference to the lab and c is the velocity of light with
reference to aether along AC′, then the velocity of the beam ② in the Lab
frame along the upward path AC is the vector sum c + v. Since this sum is
perpendicular to v, the magnitude of c + v is

$$|\mathbf{c} + \mathbf{v}| = \sqrt{c^2 - v^2}.$$

In the same way, one computes the speed of the beam in the Lab frame
down the path CA to be $\sqrt{c^2 - v^2}$ as well.

The time taken by the beam ② to cover the path ACA is therefore

$$T_2 = \frac{2L}{\sqrt{c^2 - v^2}} = \frac{2L}{c}\,\frac{1}{\sqrt{1 - v^2/c^2}}.$$

Therefore, there is a time difference

$$\Delta T = T_1 - T_2 \approx \frac{L v^2}{c^3} \qquad (v \ll c),$$

which corresponds to a fringe order $n = c\,\Delta T/\lambda = L v^2/(\lambda c^2)$ at the
centre of the screen, where λ is the wavelength of the light beam used. As the
earth goes round the sun in a circular orbit, its velocity relative to the AF
keeps changing direction. Three months later, the direction of the velocity of
the earth, which was in the +X-direction at the start of the experiment, will
have changed into the +Y-direction, so that the aether wind will now blow in
the −Y-direction in the reference frame of the earth. The roles of the paths
ABA and ACA in the above experiment will now get interchanged, so that now

$$T_1 - T_2 \approx -\frac{L v^2}{c^3}.$$

Hence, the new fringe order at the centre of the screen will be

$$n' = -n.$$

Therefore, in three months the interference pattern will shift through

$$n - n' = \frac{2 L v^2}{\lambda c^2}$$

fringes. In the actual experiment no such fringe shift was observed.

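The size of the expected effect is easier to appreciate with numbers. The following Python sketch evaluates the aether-theory travel times along the two arms and the resulting fringe shift $2Lv^2/(\lambda c^2)$. The arm length, the orbital speed of the earth and the wavelength used below are illustrative assumptions and are not taken from the text; the function names are likewise our own.

```python
import math

C = 299_792_458.0  # speed of light in vacuum (m/s)


def round_trip_times(arm_length, v, c=C):
    """Round-trip times predicted by the aether hypothesis.

    Returns (t1, t2): t1 for the arm parallel to the aether wind (path ABA),
    t2 for the arm perpendicular to it (path ACA).
    """
    t1 = arm_length / (c - v) + arm_length / (c + v)
    t2 = 2.0 * arm_length / math.sqrt(c**2 - v**2)
    return t1, t2


def expected_fringe_shift(arm_length, v, wavelength, c=C):
    """Fringe shift 2 L v^2 / (lambda c^2) when the two arms exchange roles."""
    return 2.0 * arm_length * v**2 / (wavelength * c**2)


# Illustrative values (assumptions, not from the text).
ARM_LENGTH = 11.0     # effective optical path of each arm (m)
EARTH_SPEED = 3.0e4   # orbital speed of the earth (m/s)
WAVELENGTH = 5.5e-7   # wavelength of the light (m)

t1, t2 = round_trip_times(ARM_LENGTH, EARTH_SPEED)
fringe_order = C * (t1 - t2) / WAVELENGTH   # n = c (T1 - T2) / lambda
shift = expected_fringe_shift(ARM_LENGTH, EARTH_SPEED, WAVELENGTH)

print(f"T1 - T2 = {t1 - t2:.3e} s")
print(f"shift from exact times : {2.0 * fringe_order:.4f} fringes")
print(f"shift from the formula : {shift:.4f} fringes")
```

With numbers of this order the predicted shift is a sizeable fraction of a fringe, which is why the absence of any observed shift is significant.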

2.4. Postulates of Special Relativity 


The null result of the MM experiment demolishes the notion of the Absolute
Frame. The Special Theory of Relativity therefore starts with the premise that
all IFs are equal, not merely with respect to the laws of mechanics, but
for the whole of physics. We have, however, seen that the laws of
electrodynamics do not appear to be the same in all IFs. The answer to this
paradox lies in recognizing the fallacy of the Galilean Transformation.
Before showing how the paradox is resolved satisfactorily, it will be
desirable to enunciate the two fundamental postulates of Special Relativity
on which its entire conceptual structure rests.


Postulate 1 (Relativity Postulate). All inertial frames are equivalent in the 
sense that identical experiments performed in different inertial frames will 
yield identical results. 


We mentioned earlier that Maxwell’s equations lead to a characteristic speed c
of light (by light we shall mean any electromagnetic wave). These 
equations also show that light emitted by a moving source (for example, 
every charged particle under acceleration radiates light) propagates in all 
directions with equal speed c, and c is independent of the velocity v of the 
source. This is a key notion on which further progress of our theory 
depends. Hence the following postulate. 


Postulate 2 (Source-independence of the speed of light). Light
propagates without any medium with a speed c whose value is the same in all
directions, and is independent of the velocity of the light-emitting source.


By combining the above two postulates we get a very important corollary.
Consider a light-emitting source L which is moving with respect to a frame
S. There is some comoving frame $S_0$ in which L is at rest, and light emitted
by it propagates at speed c. Since the propagation speed does not depend on the
velocity of L (postulate 2), and since this characteristic speed should be the
same in all frames of reference (postulate 1), the propagation speed c in $S_0$ (in
which L is at rest) should be the same as in S (in which L is moving). Hence the
following corollary.


Corollary 2A. The speed of propagation of light is given by $c = 3 \times 10^8$ m/s
in all inertial frames, independent of the motion of the light-emitting source.


2.5. Relativity of Simultaneity 


Let us imagine two inertial frames S and S′. S′ is moving relative to S with
velocity u in the direction of the X-axis, which is taken parallel to the
X′-axis of S′ (Fig. 2.4(a)). Times t in S and t′ in S′ are measured from the
instant when the origins O and O′ of the two frames just pass each other. At
that very moment a sharp flash of light is emitted from a source L which is
fixed to the origin of S′, so that L is stationary in S′, but moving with speed
u in S. According to Corollary 2A, light will be propagating radially with
speed c with respect to S, as well as S′. This means that a spherical
wavefront Σ, diverging from the origin O of S with speed c and having
radius ct, will contain this light flash at the instant t. Similarly, another
spherical wavefront Σ′, diverging from the origin O′ of S′ with speed c and
having radius ct′ at the instant t′, will contain this same light flash. If the
clocks of the observers S and S′ are assumed to tick at the same rate, then at
the instant t = t′, the same flash of light is simultaneously contained in two
different wavefronts Σ and Σ′ which have the same radius ct = ct′ but
different centres O and O′. This is absurd. What has gone wrong?


Fig. 2.4 Simultaneity paradox. 


We went wrong by believing in one universal time which, as we shall
find, does not fit with the postulates of relativity. Consider the same light
flash as discussed in the previous paragraph. Imagine two points P and Q
on a sphere of radius ct in frame S (Fig. 2.4(b)). Two events, e.g. “light
reaches P” and “light reaches Q”, which we designate as “$\Theta_P$” and “$\Theta_Q$”
respectively, both occur at the same time t and are, therefore, simultaneous
in S. Their coordinates are

$$\Theta_P = (x_1, y_1, z_1, t), \qquad \Theta_Q = (x_2, y_2, z_2, t) \qquad \text{in } S.$$

However, in this time t, O′ has been displaced by the distance ut to the right
of O. Consequently, P and Q no longer lie on the same sphere with centre at
O′ (Fig. 2.4(c)). P will be nearer to O′ than Q.

Therefore, the event “$\Theta_P$” occurs earlier than the event “$\Theta_Q$”, according
to S′. The coordinates of the same events with reference to S′ will be

$$\Theta_P = (x_1', y_1', z_1', t_1'), \qquad \Theta_Q = (x_2', y_2', z_2', t_2') \qquad \text{in } S',$$

and $t_1' < t_2'$. We therefore see that the relativity principles are in
contradiction with the Newtonian concept of universal time. Time assumes
a relative character in relativity. In particular, we have just discovered the
following important rule.


Rule 1. Two events which are simultaneous with respect to an observer S
cannot be simultaneous with respect to another observer S′ who is moving
relative to S (unless the direction of motion happens to be perpendicular to
the straight line joining the spatial locations of the events).
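Rule 1 can be illustrated numerically. The sketch below assumes the time transformation $t' = \gamma(t - ux/c^2)$, which follows from the Lorentz transformation derived in Chapter 3 and is used here ahead of its derivation. The function name and the sample numbers are hypothetical.

```python
import math

C = 299_792_458.0  # speed of light in vacuum (m/s)


def time_in_moving_frame(t, x, u, c=C):
    """Time t' that S' (moving with speed u along +X relative to S) assigns
    to an event occurring at time t and position x in S.

    Assumes t' = gamma * (t - u*x/c^2), taken from the Lorentz transformation.
    """
    gamma = 1.0 / math.sqrt(1.0 - (u / c) ** 2)
    return gamma * (t - u * x / c**2)


u = 0.6 * C       # speed of S' relative to S (assumed value)
t = 1.0e-6        # common time of both events in S (s)
radius = C * t    # radius of the wavefront in S at time t

# P lies ahead of O on the X-axis, Q lies behind it; both are on the wavefront.
t_p = time_in_moving_frame(t, +radius, u)
t_q = time_in_moving_frame(t, -radius, u)
print(f"t'_P = {t_p:.3e} s, t'_Q = {t_q:.3e} s")  # simultaneous in S, not in S'

# Two wavefront points separated perpendicular to the motion share the same x
# (here x = 0), so S' assigns them the same time: the exception in Rule 1.
t_up = time_in_moving_frame(t, 0.0, u)
t_down = time_in_moving_frame(t, 0.0, u)
print(f"t'_up = {t_up:.3e} s, t'_down = {t_down:.3e} s")
```

The event at P, which is nearer to O′, is assigned the earlier time in S′, in agreement with the geometrical argument above.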


One can advance arguments to show a similar discrepancy with respect to
distance measurement also. Time and distance are both divested of
absoluteness in relativity. They give different measures to different
observers who are moving with respect to each other.

Relativity rejects many of the intuitive notions of Newtonian physics.
One of the first casualties is the Galilean transformation, as the following
exercise will illustrate.

Consider an event “Φ”, namely “reception of the light flash at some
point P”. Let the coordinates of this event be (x, y, z, t) in the frame S. This
means that the light ray has covered a distance $\sqrt{x^2 + y^2 + z^2}$ in time t, as
measured in S. Therefore,

$$x^2 + y^2 + z^2 = c^2 t^2. \tag{2.14}$$

This is an equation of a family of concentric spheres, representing
wave-fronts $W_1, W_2, W_3, \ldots$, corresponding to different times $t_1, t_2, t_3, \ldots$,
as seen from S, and as shown in Fig. 2.5(a). In order to obtain the equations
of these wavefronts in the frame S′, we shall have to transform the above
equation with the help of Eq. (2.1). This gives the desired equation:

$$(x' + ut)^2 + y'^2 + z'^2 = c^2 t^2. \tag{2.15}$$


Fig. 2.5 Wavefront series seen from S and S’. 


Equation (2.15) describes a series of spheres $A_1, A_2, A_3, \ldots$ whose centres
$C_1, C_2, C_3, \ldots$ are located along the negative X-axis, as shown in Fig.
2.5(b). However, Corollary 2A requires that both S and S′ should see
concentric spherical wavefronts with centres fixed at their respective
origins. Therefore, a new set of transformation equations between (x, y, z, t)
and (x′, y′, z′, t′) is required to replace the GT. This new transformation
should satisfy the requirement that if the coordinates of the event “Φ” with
respect to S satisfy (2.14), then the coordinates of the same event “Φ” with
respect to S′ must satisfy a similar equation, namely

$$x'^2 + y'^2 + z'^2 = c^2 t'^2. \tag{2.16}$$

As a prelude to the new transformation rule (to be called Lorentz
transformation), we shall consider two paradoxical and important
consequences of our postulates, namely time dilation and length
contraction, in the following sections.
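The failure of the GT to preserve the spherical wavefront can also be checked numerically. The following sketch assumes that Eq. (2.1) is the standard Galilean transformation $x' = x - ut$, $y' = y$, $z' = z$, $t' = t$, which is consistent with Eq. (2.15). The helper names and the value of u are illustrative assumptions.

```python
C = 299_792_458.0  # speed of light in vacuum (m/s)


def galilean_transform(x, y, z, t, u):
    """Galilean transformation to a frame moving with speed u along +X."""
    return x - u * t, y, z, t


def sphere_residual(x, y, z, t, c=C):
    """(x^2 + y^2 + z^2 - c^2 t^2) / (c t)^2.

    The value is zero for an event lying on a light sphere centred at the origin.
    """
    return (x * x + y * y + z * z - (c * t) ** 2) / (c * t) ** 2


u = 0.5 * C   # relative speed of S' (assumed value)
t = 1.0       # time elapsed since the flash (s)

# Three events on the wavefront in S: light received along +X, -X and +Y.
events = {
    "+X": (C * t, 0.0, 0.0, t),
    "-X": (-C * t, 0.0, 0.0, t),
    "+Y": (0.0, C * t, 0.0, t),
}

residuals_s = {k: sphere_residual(*e) for k, e in events.items()}
residuals_s_prime = {
    k: sphere_residual(*galilean_transform(*e, u)) for k, e in events.items()
}

for k in events:
    print(k, residuals_s[k], residuals_s_prime[k])
```

The residual vanishes for every event in S but not after the Galilean transformation, so the transformed wavefront is not a sphere centred at O′. This is the defect that the new transformation must remove.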


2.6. Time Dilation 


Imagine the Michelson interference experiment being performed on a train
which is moving with velocity v with respect to the platform as shown in
Fig. 2.6(a). Let “α” represent the event that the ray ②, after coming from
the source S, is “reflected upwards at the mirror A”. Let “β” represent the
event that this ray, after bouncing downward from the overhead mirror,
is “received back at the mirror A”.

Fig. 2.6 Coordinates of a particle with respect to frames S and E.


An observer sitting in the train, call him Mr P (P for passenger),
marks the path followed by this ray. We shall denote his frame of reference,
i.e. the frame fixed on the floor of the train, by the same letter P. Let
the length of the arm AC, as measured in the frame P, be $L_0$. Since light
propagates with speed c in P, the time of flight $T_0$ of the ray ② in covering
the round trip ACA, as measured in the frame P, is

$$T_0 = \frac{2L_0}{c}. \tag{2.17}$$

Another observer Mr S (S for Station Master), standing on the railway
station, watches the same pair of events “α” and “β”. He finds them taking
place at two different points A and A″ on the track (Fig. 2.6(b)). Light takes
a longer route AC′A″ in the frame S, and therefore, travelling with the same
speed c (according to Corollary 2A), must take a longer time T (as
measured in S) in covering this route. Assuming that the length of the
vertical arm of the MM apparatus (i.e. the perpendicular distance between
C′ and the line AA″) measures out to be the same length $L_0$ in S as in P (we
shall justify this statement later), the length of the path traversed by light, as
seen from the frame S, is $2\sqrt{L_0^2 + (vT/2)^2}$, which should now equal cT.
Therefore,

$$2\sqrt{L_0^2 + \left(\frac{vT}{2}\right)^2} = cT. \tag{2.18}$$

Using Eqs. (2.17) and (2.18), one obtains a relation between T and $T_0$:

$$c^2 T^2 = c^2 T_0^2 + v^2 T^2,$$

so that

$$T_0 = \sqrt{1 - \frac{v^2}{c^2}}\; T. \tag{2.19}$$

Note that $T_0$ and T are the time intervals, as measured in P and S
respectively, between the same pair of events “α” and “β”.

The reader may conclude from the above discussions that “moving
clocks go slow”. Mr P is moving and Mr S is stationary! Therefore, Mr P’s
clock registers a smaller time than Mr S’s clock. However, such arguments
are wrong and contradict the very spirit of the relativity principle.

The fallacy in the above statement lies in that if S can claim that P’s
clock is going slow because P is moving (relative to S), then P can also
claim that S’s clock is going slow because S is moving (relative to P). As we
have stressed in Chapter 1, motion is relative. And though it may sound
strange, the relativity postulates find both P and S to be correct. Each one’s
clock is going slow with respect to the other.

Let us resolve the paradox implied in the last sentence. We define
proper time between any two events to be the time interval between these
events, as measured in one particular frame of reference $S_0$ in which both
the events happen to take place at the same spatial location. What we mean
by this is that if two events have coordinates $(x_1, y_1, z_1, t_1)$ and $(x_2, y_2, z_2, t_2)$
in $S_0$, such that $x_1 = x_2$, $y_1 = y_2$, $z_1 = z_2$, then we call $T_0 = t_2 - t_1$ the proper
time between the events. It should be remembered, however, that for a
particular pair of events there may or may not exist a frame in which both
the events will occur at the same location, and hence, there may or may not
exist a “proper time” between these events.

In the above example, the events “α” and “β” occur at the same floor
location A in P’s frame of reference, whereas they take place at two
different locations, namely A and A″, along the railway track, as seen from
S’s frame of reference. Therefore, P measures proper time between “α” and
“β” whereas S measures “improper” time. Please note that we are using the
terms “proper” and “improper” not to mean right and wrong. Proper time
is just a nomenclature, a definition. “Improper time” is any time interval
that does not satisfy that definition. It has been assumed that both observers
P and S are using standard clocks for measuring time intervals, and,
therefore, both measurements, i.e. “proper time” and “improper time”, are
correct measurements, but with respect to different observers. It should be
understood that the term “proper time” has no meaning except with
reference to a particular pair of events.

The correct conclusion from the result of the above calculation is that
the proper time interval $T_0$ between a given pair of events “α” and “β” is
always less than the corresponding “improper” time interval T between
them.

It will be convenient to introduce at this stage the Lorentz factor γ,
which we define as follows:

$$\gamma = \frac{1}{\sqrt{1 - \beta^2}}, \quad \text{where } \beta = \frac{v}{c}. \tag{2.20}$$

Since the Lorentz factor γ defined above is associated with the operation
boost (see the meaning in Sec. 3.1), we shall in future call it the boost Lorentz
factor, in order to distinguish it from the dynamic Lorentz factor Γ to be
introduced later through Eq. (4.15), in Chapter 4.

Note that

$$\gamma \ge 1, \tag{2.21}$$

and the following identities which can be proved easily:

$$1 - \frac{1}{\gamma^2} = \beta^2, \qquad \frac{1}{\gamma} = \gamma(1 - \beta^2), \qquad \gamma^2 - 1 = \gamma^2\beta^2. \tag{2.22}$$


Using the Lorentz factor, we shall summarize the time dilation formula
(2.19) in the form of the following very important rule:


Rule 2. Let there be a frame of reference $S_0$ with respect to which two
events “α” and “β” occur at the same spatial location. Let S be another
frame of reference which is moving with uniform velocity v with respect to
$S_0$. If $T_0$ and T be the time intervals, as measured in $S_0$ and S, respectively,
between “α” and “β” (so that $T_0$ is the proper time between the events), then

$$T_0 = \sqrt{1 - \beta^2}\; T = \frac{1}{\gamma}\, T. \tag{2.23}$$
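Rule 2 can be checked against the light-path geometry of the train example. The following Python sketch computes the round-trip time seen by Mr P from Eq. (2.17), solves the path condition $cT = 2\sqrt{L_0^2 + (vT/2)^2}$ for the time seen by Mr S, and compares the two through the Lorentz factor. The speed of the train and the arm length are assumed values chosen only for illustration, and the function names are our own.

```python
import math

C = 299_792_458.0  # speed of light in vacuum (m/s)


def lorentz_factor(beta):
    """Boost Lorentz factor gamma = 1/sqrt(1 - beta^2), Eq. (2.20)."""
    return 1.0 / math.sqrt(1.0 - beta**2)


def proper_round_trip_time(arm_length, c=C):
    """T0 = 2 L0 / c, measured by Mr P in the train, Eq. (2.17)."""
    return 2.0 * arm_length / c


def platform_round_trip_time(arm_length, v, c=C):
    """T measured by Mr S: c*T = 2*sqrt(L0^2 + (v*T/2)^2) solved for T."""
    return 2.0 * arm_length / math.sqrt(c**2 - v**2)


beta = 0.8          # assumed speed of the train in units of c
arm_length = 1.5    # assumed proper length L0 of the vertical arm (m)

t0 = proper_round_trip_time(arm_length)
t = platform_round_trip_time(arm_length, beta * C)
gamma = lorentz_factor(beta)

print(f"T0 = {t0:.4e} s, T = {t:.4e} s, gamma * T0 = {gamma * t0:.4e} s")
```

The two routes to T agree: the geometrical solution coincides with $\gamma T_0$, and the proper time $T_0$ is the smaller of the two intervals.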


2.7. Length Contraction 


In a similar vein, we can define proper length to be the length of an object 
in its rest frame, i.e. the length measured in that particular frame in which it 
is at rest. 

Let us once again examine the MM experiment being conducted on the
train as was shown in Fig. 2.6. This apparatus is stationary in the frame of
reference of the train. Therefore, the lengths $L_0$ of the arms AB and AC,
when measured with meter sticks laid on the floor of the train (or held
stationary on board the train), are the proper lengths of these arms. The
same lengths measured by meter sticks laid on the ground may be called
“improper” lengths of these segments. Instead of using meter sticks, an
alternative and better means of length measurement will be with the
help of light beams, which we shall now employ for conceptualizing the
length paradox.

Let us, therefore, assume that both P and S measure the length of the
arm AB using the time of flight of a light ray from A to B and then back
from B to A. Let these measurements be $L_0$ to Mr P, and L to Mr S. We shall
establish a relationship between $L_0$ and L with the help of relationship
(2.19) between proper time and “improper” time. For this purpose, we shall
identify three significant events along the journey route ABA of the ray of
light. These events are:


• “$\Theta_{A1}$” = “the light ray passes through the half-silvered plate A” (on its
way towards the mirror B).

• “$\Theta_B$” = “the ray is reflected back at the mirror B”.

• “$\Theta_{A2}$” = “the ray returns to the half-silvered plate A” (after being
reflected at the mirror B).


As light travels with velocity c in the reference frame of the train, Mr P
measures the same propagation time $T_0'$ for the light ray to go from A to B,
i.e. between the events “$\Theta_{A1}$” and “$\Theta_B$”, and also to return from B to A, i.e.
between the events “$\Theta_B$” and “$\Theta_{A2}$”. The total time of the round trip flight is
then $T_0 = 2T_0'$. The total length travelled during this time is $2L_0$. Therefore,
$2L_0 = 2cT_0' = cT_0$. Hence,

$$L_0 = \frac{cT_0}{2}. \tag{2.24}$$

Mr S observes the same phenomena from his own reference frame, the
platform. According to his watch, the light ray takes time $T_1$ for its forward
trip, i.e. between the events “$\Theta_{A1}$” and “$\Theta_B$”, during which time the mirrors
move from the locations A and B to A′ and B′, and a different time $T_2$ for the
return trip (i.e. between the events “$\Theta_B$” and “$\Theta_{A2}$”, during which time the
mirrors move from A′ and B′ to A″ and B″), as suggested in Fig. 2.7.

It should be clear from these diagrams that $cT_1 = L + vT_1$, and $cT_2 = L - vT_2$,
so that

$$T_1 = \frac{L}{c - v}; \qquad T_2 = \frac{L}{c + v}.$$

Fig. 2.7. The ray ① traced by Mr S.

The total time for the round trip according to Mr S is then

$$T = T_1 + T_2 = \frac{2Lc}{c^2 - v^2} = \frac{2\gamma^2 L}{c}.$$

Therefore,

$$L = \frac{cT}{2\gamma^2}. \tag{2.25}$$

Note that $T_0$, appearing in Eq. (2.24), is the “proper time”, and T
appearing in Eq. (2.25) is an “improper time”, between the events “$\Theta_{A1}$”
and “$\Theta_{A2}$”. Connecting Eqs. (2.25) and (2.24) with the help of Eq. (2.23)
one now obtains in a straightforward way the required relationship between
L and $L_0$, which we have written below as Eq. (2.26) at the conclusion of
the following important rule.


Rule 3. Let $L_0$ be the proper length of a rod (as measured in its rest frame
$S_0$). Let the rod be moving longitudinally with velocity v with respect to
another frame S (i.e. v is parallel to the axis of the rod), and let the
(improper) length of the rod, as measured in the frame S, be L. The
relation between the two is given by the formula:

$$L_0 = \frac{1}{\sqrt{1 - \beta^2}}\, L = \gamma L. \tag{2.26}$$

Compare this with the relation between the proper time and improper
time as given in Eq. (2.23).

Since γ ≥ 1, it is seen from Eq. (2.26) that Mr S measures a smaller
value for the longitudinal dimension (i.e. the dimension which is parallel to
its direction of motion) than Mr P. He therefore concludes that when an
object moves, its longitudinal dimension contracts.
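The chain of reasoning that leads to Rule 3 can be followed numerically. The sketch below starts from the proper round-trip time of Eq. (2.24), converts it to the improper time through Eq. (2.23), and then solves Mr S's relation $T = L/(c - v) + L/(c + v)$ for L. The speed and the proper length are assumed values chosen only for illustration, and the function names are our own.

```python
import math

C = 299_792_458.0  # speed of light in vacuum (m/s)


def lorentz_factor(beta):
    """Boost Lorentz factor gamma = 1/sqrt(1 - beta^2)."""
    return 1.0 / math.sqrt(1.0 - beta**2)


def length_seen_from_platform(proper_length, beta, c=C):
    """Length L of the arm AB according to Mr S, from light travel times."""
    v = beta * c
    t0 = 2.0 * proper_length / c      # proper round-trip time, Eq. (2.24)
    t = lorentz_factor(beta) * t0     # improper time, Eq. (2.23)
    # Mr S: T = L/(c - v) + L/(c + v) = 2 L c / (c^2 - v^2); solve for L.
    return t * (c**2 - v**2) / (2.0 * c)


beta = 0.6             # assumed speed of the train in units of c
proper_length = 10.0   # assumed proper length L0 of the arm AB (m)

length = length_seen_from_platform(proper_length, beta)
print(f"L0 = {proper_length} m, L = {length:.6f} m, "
      f"L0 / gamma = {proper_length / lorentz_factor(beta):.6f} m")
```

The length obtained from the light travel times equals $L_0/\gamma$, as Eq. (2.26) requires.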

It will be in order to suggest an operational model of length
measurement both for subsequent reference as well as for elucidation of the
meaning of Eq. (2.26). The length of a moving stick can be measured by
taking its shadow-graph on a photographic plate laid on the floor (or on a
table) of the laboratory as the stick shoots past it. We have illustrated this in
Fig. 2.8(a). An overhead array of flashguns has been provided for casting
shadow on the plate. The end guns $G_1$ and $G_2$ are triggered simultaneously,
so that the shadows $P_1$ and $P_2$ of the endpoints $A_1$ and $A_2$ of the stick are
also etched simultaneously on the plate. (Note that these etching events are
simultaneous in the Lab frame only.) The distance L between the marks $P_1$
and $P_2$ (which are permanent marks on the table) measured subsequently
using any standard meter stick will give the laboratory measure of the
length of the moving stick. It is this L which is related to the “proper
length” $L_0$ of the stick through Eq. (2.26). (We have obtained this
relationship using the Lorentz transformation in Sec. 3.3.)


Fig. 2.8. Measurement of longitudinal length: (a) shadow-graph of the moving stick; (b) proper length $L_0$ and “improper” length $L = L_0/\gamma$ of a rod moving with speed cβ.


We shall adopt the following mathematical description for the outcome 
of the above length measurement experiment. 


Theorem 2.1. Let R and $R_0$ be two straight rods sliding past each other
longitudinally with a relative speed cβ. If the segment $P_1P_2$ of R coincides
with the segment $A_1A_2$ of $R_0$, when viewed from R (so that R is stationary
and $R_0$ is seen to be moving), and if the proper lengths of $P_1P_2$ and $A_1A_2$ are
L and $L_0$, respectively, then

$$L = \frac{L_0}{\gamma}. \tag{2.27}$$

A popular way of saying the same thing is: a rod of proper length $L_0$
shrinks to a smaller length $L = L_0/\gamma$ when moving with velocity βc. We have
portrayed this concept graphically in Fig. 2.8(b).

Theorem 2.1 will play a very important role as the mathematical model 
of length measurement in the subsequent development of the formalism. 

In Sec. 3.6 of Chapter 3 we have reinforced the concepts of
simultaneity, time dilation and length contraction through Theorems 3.1-3.3.
We have shown through actual calculations:


• the exact time difference between two events in a second frame,
when the same events are simultaneous in the first one;

• the relation between the two different lengths ℓ′ and ℓ″ measured in
a given frame of reference, when they match the same length ℓ
measured in another frame of reference;

• the relation between time intervals $T_1$ and $T_2$ in a given frame of
reference when the corresponding time intervals are the same in
another one.


The reader should understand the logic presented there in order to get a 
better grip of these exotic concepts. 

We shall now take up transverse length and suggest a reason why the 
transverse dimension of an object (i.e. the dimension measured 
perpendicular to the direction of its motion) should not change due to the 
motion. 

Imagine two identical and parallel rods L and R, each equipped with 
markers at their ends (Fig. 2.9). Let A and B be the markers of L, and C and 
D the markers of R. When at rest, the marker A coincides with C, and the 
marker B with D, thereby confirming that the proper lengths of the rods are 
equal. Now let the rods move transversely towards each other. As they zip 
past each other, the markers mark the opposite rods. If the length of the rod 
L, as seen from the rest frame of R, remains unchanged, then the marks 
made by A and B will fall on C and D, respectively. Otherwise, only one set 
of markers, say, A and B of the rod L, will be able to make marks on the 
body of the other rod R, implying thereby that the rod L has become shorter 
due to its motion relative to R. An observer in the rest frame of R would 
then conclude that the length of the rod L has shrunk due to its transverse 
motion. 


Fig. 2.9 Measurement of transverse length.


An observer in the rest frame of L can also examine the marks. There 
cannot be disagreement between the two observers about the fact that these 
marks lie within the body of R. Therefore, this observer would conclude 
that the length of the rod R (which is in motion relative to L) has expanded 
due to its transverse motion. 

Thus two different observers observe two different effects of speed on
the transverse dimension of a rod (i.e. expansion in one case and
contraction in the other). This contradicts postulate 1. Hence we conclude
that transverse dimensions cannot change due to motion.

We shall therefore record the above finding in the following rule. 


Rule 4. When a rod moves transversely, as seen from a frame S, its length L 
as measured in S equals its proper length Lo. 


We shall supplement rules 3 and 4 with two important corollaries. We
illustrate them with the help of Fig. 2.10. It shows an object A of arbitrary
shape. $S_0$ is the rest frame of the object (we assume that every part of A is at
rest in $S_0$).


Corollary 4A. Let the rest frame $S_0$ be moving with velocity v along the
X-axis with respect to another frame S. Let the dimensions of the object along
the X-, Y- and Z-directions be $\delta x_0, \delta y_0, \delta z_0$ in $S_0$ and $\delta x, \delta y, \delta z$ in S. Then

$$\delta x = \frac{1}{\gamma}\,\delta x_0; \qquad \delta y = \delta y_0; \qquad \delta z = \delta z_0. \tag{2.28}$$


Fig. 2.10 Volume transformation.


Corollary 4B. If $\delta V_0$ is the proper volume of the object (i.e. the volume
measured in $S_0$), then its volume, as measured in S, is

$$\delta V = \frac{1}{\gamma}\,\delta V_0. \tag{2.29}$$
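Corollaries 4A and 4B can be illustrated with a rectangular box. The following sketch assumes that the corollaries state $\delta x = \delta x_0/\gamma$, $\delta y = \delta y_0$, $\delta z = \delta z_0$ and $\delta V = \delta V_0/\gamma$. The box dimensions, the speed and the function names are hypothetical.

```python
import math


def lorentz_factor(beta):
    """Boost Lorentz factor gamma = 1/sqrt(1 - beta^2)."""
    return 1.0 / math.sqrt(1.0 - beta**2)


def contracted_dimensions(dx0, dy0, dz0, beta):
    """Dimensions in S of an object at rest in S0, which moves along X with
    speed c*beta: only the longitudinal dimension is contracted."""
    return dx0 / lorentz_factor(beta), dy0, dz0


def contracted_volume(proper_volume, beta):
    """Volume in S of an object whose proper volume is given."""
    return proper_volume / lorentz_factor(beta)


beta = 0.6                      # assumed speed of S0 relative to S, in units of c
dx0, dy0, dz0 = 2.0, 3.0, 4.0   # assumed proper dimensions of the box (m)

dx, dy, dz = contracted_dimensions(dx0, dy0, dz0, beta)
volume = contracted_volume(dx0 * dy0 * dz0, beta)

print(f"dimensions in S: {dx:.3f} x {dy:.3f} x {dz:.3f} m")
print(f"volume in S: {volume:.3f} m^3 (product of dimensions: {dx * dy * dz:.3f})")
```

Since only one of the three dimensions contracts, the volume is reduced by a single factor of γ, which the product of the transformed dimensions confirms.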


With the foresight gained through the discussions presented in this 
section, we shall now search for the correct transformation equation 
connecting the coordinates of a given event in two different inertial frames 
of reference. 


Chapter 3 


Lorentz Transformation 


3.1. Lorentz Transformation I: Special Case 


We shall obtain the transformation equations between the coordinates (x, y,
z, t) and (x′, y′, z′, t′) of an event as measured in two frames of reference S
and S′ which are moving relative to each other. The underlying concepts are
best exemplified by considering the simplest example, namely boost in the
X-direction.

We shall use the term boost to mean relative motion between two
frames of reference. Figure 3.1 illustrates a boost cβ of S′ relative to S along
the X-axis, which we shall write compactly as “boost: S(cβ, 0, 0)S′”. Note
that

$$\text{boost: } S(c\beta, 0, 0)S' = \text{boost: } S'(-c\beta, 0, 0)S. \tag{3.1}$$

That is, the above arrangement also means a boost −cβ of S relative to S′
along the X′-axis. A more general boost will be

$$\text{boost: } S(c\beta_x, c\beta_y, c\beta_z)S' = \text{boost: } S'(-c\beta_x, -c\beta_y, -c\beta_z)S. \tag{3.2}$$

For all such boosts, we have the following assumptions:

#1 The coordinate axes XYZ of S are parallel to their counterparts X′Y′Z′ of
S′.

#2 The time origins of both frames are chosen to be the instant (i.e. the
clocks in both frames are set to zero hour at the instant) when the origins
O and O′ of the space frames cross each other.

#3 For the special case of boost: S(cβ, 0, 0)S′, we further assume that S′ is
moving in the direction of the X-axis with velocity v = cβ relative to S,
and that the X′-axis lies along the X-axis.


We shall call this configuration the standard configuration.


Fig. 3.1. Standard configuration of frames S and S′.


An event “Θ” viewed from these two frames of reference is also shown
in Fig. 3.1. The transformation equation that we shall obtain for this
particular case is the simplest special case of a general class of relativistic
transformation equations, called Lorentz transformation, which we shall
often abbreviate as LT.

An anomaly exists among the coordinates x, y, z, t in the sense that the
last coordinate has a different dimension from the first three. The usual
convention in relativity is, therefore, to adopt ct for the time coordinate.
This sounds reasonable because of the frame independence of the speed of
light. From now on, the coordinates of every event will be written as (ct, x,
y, z), with the time coordinate preceding the space ones, and each one of the
coordinates having the dimension of length. In conformity with this
practice, we shall measure all time intervals in the unit of ct. We shall, for
example, say that a certain event has occurred at the instant ct, or that the
time interval between two events “α” and “β” is cδt.

Let us therefore think of an event “Θ” whose coordinates are (ct, x, y, z)
in S and (ct′, x′, y′, z′) in S′. We shall find a relationship between the two sets
of coordinates. For convenience of visualization, imagine that the event is a
lightning which strikes along the X- and X′-axes. We think of these axes as
infinitely long straight rods sliding longitudinally along each other (Fig.
3.2). The lightning “Θ” leaves permanent marks Q′ on the X′-rod and Q on
the X-rod. Since the y and z coordinates of this event are identically zero in both
S and S′, we shall ignore them for the time being and write the event “Θ” as

$$\Theta = \text{“Q′ passes Q”} = \begin{cases} (ct, x) & \text{in } S, \\ (ct', x') & \text{in } S', \end{cases} \tag{3.3}$$

where x = OQ = proper length of the intercept OQ, and x′ = O′Q′ = proper
length of the intercept O′Q′.


Fig. 3.2 Lightning Θ as seen from S and S′.


Figure 3.2(a) shows a view of the event as seen from the frame S. We
have represented the viewing frame with continuous lines and the viewed
frame with broken lines. The origin O′ of S′ is seen to coincide with the
mark P on the X-rod at the instant when “Θ” occurs. By this we mean that
the event “Φ” = “O′ passes P” is simultaneous with “Θ” in the frame S, i.e.
they have the same time coordinate ct in S. From assumption #3, OP = βct.
According to (3.3), the “moving segment” O′Q′ (moving to the right) and
the “stationary segment” OQ have proper lengths x′ and x, respectively. The
proper length of OP is βct. Therefore, the proper length of the segment PQ is
given by PQ = OQ − OP = x − βct. Also, seen from S, the “moving segment”
O′Q′ coincides with the “stationary segment” PQ. Hence, from Eq. (2.27),

$$x - \beta ct = \frac{x'}{\gamma}. \qquad \text{Hence} \quad x' = \gamma(x - \beta ct). \tag{3.4}$$


Let us now view the same lightning, i.e. the event “Θ”, from the frame 
S', represented in Fig. 3.2(b). The origin O of S, while moving to the left, is 
seen to coincide with some mark M' on the negative X'-axis at the instant 
when “Θ” occurs. By this we mean that the event “Φ” = “O passes M'” is 
simultaneous with “Θ” in the frame S', i.e. they have the same time 
coordinate ct' in S'. By assumption #3, O'M' = βct'. According to (3.3), the 
“stationary segment” O'Q' and the “moving segment” OQ (moving to the 
left) have proper lengths x' and x, respectively. The “moving segment” OQ 
and the “stationary segment” M'Q' have, therefore, proper lengths x and βct' 
+ x', respectively, and they coincide when seen from S'. Therefore, by using 
Eq. (2.27) again, 

$$ \beta ct' + x' = \frac{x}{\gamma}. \tag{3.5} $$

Hence x = γ(βct' + x'). 


Eliminating x' in Eq. (3.5) with the help of Eq. (3.4), we get 

$$ x = \gamma^2 (x - \beta ct) + \gamma\beta ct', $$

or 

$$ ct' = \gamma ct - \frac{\gamma^2 - 1}{\gamma\beta}\, x = \gamma (ct - \beta x), \tag{3.6} $$

where we have made use of Eq. (2.22). 
Alternatively, one can eliminate x in Eq. (3.4) with the help of Eq. (3.5), 
leading to 

$$ ct = \gamma (ct' + \beta x'). \tag{3.7} $$


Equations (3.6) and (3.4) represent the equations of transformation from 
(ct, x) to (ct', x'). Equations (3.7) and (3.5) constitute the inverse 
transformation, i.e. from (ct', x') to (ct, x). 

The above equations are supplemented by the equations of 
transformation for the y and z coordinates corresponding to a more general 
event “Φ” having arbitrary non-zero values for all the four coordinates. 
Therefore, let a lightning “Φ” strike an arbitrary point T, which for 
convenience is identified as the top of a tower. We assign to this event the 
coordinates (ct, x, y, z) in S and (ct', x', y', z') in S'. 


Fig. 3.3. Lightning striking a tower. 


We have shown the event, as viewed from S, in Fig. 3.3. (Note that 
neither of the S and S' frames is a rest frame of the tower.) Assuming that 
the base of the tower is on the XZ-plane, the height of the tower should be y 
in S and y' in S'. According to Rule 4, y = y'. By a similar argument, z = z'. 
Let the tower top T be projected to the mark Q on the X-rod and Q' on the 
X'-rod, as shown in Fig. 3.3. Then the event “Θ” = “Q' passes Q” is the 
same event as represented in Eq. (3.3), whose coordinate transformation we 
have just worked out. Summarizing all that we have so far accomplished, 
we can write the transformation equations of the coordinates for any 
arbitrary event “Φ” as follows: 


(i) Transformation equation from S to S': 

$$ ct' = \gamma (ct - \beta x), \tag{3.8a} $$
$$ x' = \gamma (x - \beta ct), \tag{3.8b} $$
$$ y' = y, \tag{3.8c} $$
$$ z' = z. \tag{3.8d} $$

(ii) Inverse transformation equation from S' to S: 

$$ ct = \gamma (ct' + \beta x'), \qquad x = \gamma (x' + \beta ct'), \qquad y = y', \qquad z = z'. \tag{3.9} $$


The transformation equations (3.8) and (3.9) are the simplest examples 
of Lorentz transformation. We shall refer to them as the standard Lorentz 
transformation corresponding to the standard configuration shown in Fig. 
3.1. 

Two inertial frames of reference S and S' that are connected to each 
other by a Lorentz transformation — as in (3.8), (3.9), or more generally as 
in (3.18), (3.19) — have an alternative name: Lorentz frames. 
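
The standard Lorentz transformation (3.8) and its inverse (3.9) can be tried out numerically. The following is only a minimal sketch: the function names `gamma`, `boost_x` and `inverse_boost_x`, the sample event and the value β = 0.6 are illustrative assumptions, not part of the text. It uses the fact that (3.9) is (3.8) with β replaced by −β.

```python
import math

def gamma(beta):
    """Lorentz factor for a dimensionless speed beta = v/c, with |beta| < 1."""
    return 1.0 / math.sqrt(1.0 - beta * beta)

def boost_x(event, beta):
    """Eq. (3.8): map (ct, x, y, z) in S to (ct', x', y', z') in S'."""
    ct, x, y, z = event
    g = gamma(beta)
    return (g * (ct - beta * x), g * (x - beta * ct), y, z)

def inverse_boost_x(event_prime, beta):
    """Eq. (3.9): the same formulas as Eq. (3.8) with beta replaced by -beta."""
    return boost_x(event_prime, -beta)

# Hypothetical sample event; all four coordinates carry the dimension of length.
event = (5.0, 3.0, 1.0, -2.0)
primed = boost_x(event, 0.6)           # gamma = 1.25 for beta = 0.6
back = inverse_boost_x(primed, 0.6)    # should reproduce the original event

print("in S' :", primed)
print("back  :", back)
```

With these sample numbers the event lies on the world line x = βct of the origin O', so its x' coordinate comes out as zero (up to rounding), and the round trip returns the original coordinates.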


3.2. Lorentz Transformation II: General Case 


As we found out in Sec. 3.4, one reason why GT is not acceptable within 
the scheme of special relativity is that it yields different speeds of light in 
different inertial frames, whereas Corollary 2A in Chapter 2 requires frame 
independence of the speed of light. It is now incumbent on any relativistic 
transformation equation to meet this requirement of Corollary 2A. 

Consider an event “E”, namely “emission of a flash of light”, which 
takes place at the origin of S at ct = 0. Consider another frame S' obtained 
from S by a more general boost, as defined at the beginning of Sec. 3.7. By 
assumption (b), the event takes place also at the origin of S' and at the time 
ct' = 0 (as measured in S'). Therefore, the coordinates of “E” are: 

$$ \text{“E”}: \begin{cases} (0, 0, 0, 0) & \text{in } S, \\ (0, 0, 0, 0) & \text{in } S'. \end{cases} $$

Let “R” represent a subsequent event, namely “reception of the flash of light 
by an observer”. Let its coordinates be 

$$ \text{“R”}: \begin{cases} (ct, x, y, z) & \text{in } S, \\ (ct', x', y', z') & \text{in } S'. \end{cases} $$


Therefore, Corollary 2A in Chapter 2 requires that the following two 
equations be both satisfied [refer to Eqs. (2.14) and (2.16)]: 

$$ c^2t^2 - (x^2 + y^2 + z^2) = 0, \qquad c^2t'^2 - (x'^2 + y'^2 + z'^2) = 0. \tag{3.10} $$


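Thus, we define Lorentz transformation to be a linear transformation 
between the Cartesian coordinates (ct', x', y', z') and (ct, x, y, z) of any event 
“P”, such that 

$$ c^2t'^2 - (x'^2 + y'^2 + z'^2) = c^2t^2 - (x^2 + y^2 + z^2). \tag{3.11} $$

(Properly speaking, what we have defined is a homogeneous LT, which 
corresponds to the fact that the clock in either frame is set to zero when the 
origins O and O' pass each other.) The adjective “linear” used in the above 
definition implies that each one of the coordinates (ct', x', y', z') is a linear 
function of the coordinates (ct, x, y, z) and vice versa. This is the same thing 
as saying that there exists a matrix Ω, with constant elements Ω_ij, such that 

$$ \begin{pmatrix} ct' \\ x' \\ y' \\ z' \end{pmatrix} = \begin{pmatrix} \Omega_{00} & \Omega_{01} & \Omega_{02} & \Omega_{03} \\ \Omega_{10} & \Omega_{11} & \Omega_{12} & \Omega_{13} \\ \Omega_{20} & \Omega_{21} & \Omega_{22} & \Omega_{23} \\ \Omega_{30} & \Omega_{31} & \Omega_{32} & \Omega_{33} \end{pmatrix} \begin{pmatrix} ct \\ x \\ y \\ z \end{pmatrix}. \tag{3.12} $$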


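We shall call this matrix the Lorentz transformation matrix. Here note that 
the rows and columns of the above matrix have been indexed by the 
numbers 0, 1, 2, 3. This is in conformity with the indexing convention we 
shall adopt in Chapter 7. By way of illustration, we shall retrieve the LT 
matrix associated with the boost cβ in the X-direction out of Eq. (3.8), 
which can be rewritten in the following matrix form: 

$$ \begin{pmatrix} ct' \\ x' \\ y' \\ z' \end{pmatrix} = \begin{pmatrix} \gamma & -\gamma\beta & 0 & 0 \\ -\gamma\beta & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} ct \\ x \\ y \\ z \end{pmatrix}. \tag{3.13} $$

Hence, the Lorentz transformation matrix associated with a boost cβ in the 
X-direction is as follows: 

$$ \Omega = \begin{pmatrix} \gamma & -\gamma\beta & 0 & 0 \\ -\gamma\beta & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \tag{3.14} $$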


We shall now prove that the Lorentz transformation represented by Eq. 
(3.8), which is also equivalent to Eq. (3.13), satisfies the requirement of Eq. 
(3.11). 


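Proof of Eq. (3.11) 

$$ c^2t'^2 - (x'^2 + y'^2 + z'^2) = \{\gamma (ct - \beta x)\}^2 - \{\gamma (x - \beta ct)\}^2 - y^2 - z^2 $$
$$ = \gamma^2 (1 - \beta^2)(c^2t^2 - x^2) - y^2 - z^2 $$
$$ = c^2t^2 - (x^2 + y^2 + z^2). \quad \text{(QED)} $$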


Let us now take up the case of a general boost {S(β)S'}, as illustrated in 
Fig. 3.4. In this case, the boost velocity is $\boldsymbol{\beta}$, in any arbitrary direction n. 
That is, the velocity of the origin O' of the frame S' is $\boldsymbol{\beta} = \beta\mathbf{n}$ (in units of c) 
with respect to the origin O of the frame S. Here n is a unit vector in the 
direction of $\boldsymbol{\beta}$. 

We have set up a new set of coordinate axes (ξ, ζ, η) to replace (x, y, z) 
in S, and (ξ', ζ', η') to replace (x', y', z') in S', such that (i) the axes ξ of S and 
ξ' of S' coincide and are oriented in the direction of $\boldsymbol{\beta}$, (ii) the origins O and 
O' coincide at t = t' = 0. We shall adjust the simple LT given in (3.8) to the 
general LT pertaining to this general case. 

Let r and r' represent the radius vectors from O and O', respectively, to 
the location of the event “Θ”, as measured in S and S', respectively. We shall 
resolve these vectors parallel to and perpendicular to the direction n: 

$$ \mathbf{r} = \xi\,\mathbf{e}_\xi + \zeta\,\mathbf{e}_\zeta + \eta\,\mathbf{e}_\eta = \xi\,\mathbf{n} + \mathbf{r}_\perp = \mathbf{r}_\parallel + \mathbf{r}_\perp, \tag{3.15} $$

where 
$\mathbf{r}_\parallel = \xi\,\mathbf{n} = \xi\,\mathbf{e}_\xi$, the component of r parallel to $\boldsymbol{\beta}$, 
$\mathbf{r}_\perp = \zeta\,\mathbf{e}_\zeta + \eta\,\mathbf{e}_\eta$, the component of r perpendicular to $\boldsymbol{\beta}$. 


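Fig. 3.4. A general boost. 

Now, we shall obtain the analogues of Eq. (3.8): 

$$ \text{(3.8a)} \Rightarrow ct' = \gamma (ct - \beta\xi) \Rightarrow ct' = \gamma (ct - \boldsymbol{\beta}\cdot\mathbf{r}), $$
$$ \text{(3.8b)} \Rightarrow \xi' = \gamma (\xi - \beta ct) \Rightarrow \mathbf{r}'_\parallel = \xi'\mathbf{n} = \gamma (\xi - \beta ct)\,\mathbf{n} = \gamma (\mathbf{r}_\parallel - \boldsymbol{\beta} ct), \tag{3.16} $$
$$ \text{(3.8c), (3.8d)} \Rightarrow \mathbf{r}'_\perp = \mathbf{r}_\perp. $$

From the last two lines, 

$$ \mathbf{r}' = \mathbf{r}'_\parallel + \mathbf{r}'_\perp = (\gamma\mathbf{r}_\parallel + \mathbf{r}_\perp) - \gamma\boldsymbol{\beta} ct $$
$$ = (\mathbf{r}_\parallel + \mathbf{r}_\perp) + (\gamma - 1)\mathbf{r}_\parallel - \gamma\boldsymbol{\beta} ct $$
$$ = \mathbf{r} + (\gamma - 1)\mathbf{r}_\parallel - \gamma\boldsymbol{\beta} ct $$
$$ = \mathbf{r} + (\gamma - 1)(\mathbf{r}\cdot\mathbf{n})\,\mathbf{n} - \gamma\boldsymbol{\beta} ct. \tag{3.17} $$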


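Noting that $\mathbf{n} = \boldsymbol{\beta}/\beta$, we can summarize as follows. The Lorentz transformation 
corresponding to the general boost {S(β)S'} has the following form: 

$$ ct' = \gamma (ct - \boldsymbol{\beta}\cdot\mathbf{r}) \tag{3.18a} $$
$$ \phantom{ct'} = ct + [(\gamma - 1)ct - \gamma\,\boldsymbol{\beta}\cdot\mathbf{r}]; \tag{3.18b} $$
$$ \mathbf{r}' = (\gamma\mathbf{r}_\parallel + \mathbf{r}_\perp) - \gamma\boldsymbol{\beta} ct \tag{3.18c} $$
$$ \phantom{\mathbf{r}'} = \mathbf{r} + \left[ \frac{\gamma - 1}{\beta^2} (\boldsymbol{\beta}\cdot\mathbf{r})\,\boldsymbol{\beta} - \gamma\boldsymbol{\beta} ct \right] \tag{3.18d} $$
$$ \phantom{\mathbf{r}'} = \mathbf{r} + [(\gamma - 1)(\mathbf{r}\cdot\mathbf{n}) - \gamma\beta ct]\,\mathbf{n}. \tag{3.18e} $$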


We have written each transformation in two or three different forms, so 
that the reader will have several choices for different applications. 

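The inverse of the above transformation will correspond to the boost 
{S'(−β)S}, and can be obtained from (3.18) by exchanging (ct, r) with (ct', r') 
and replacing $\boldsymbol{\beta}$ with $-\boldsymbol{\beta}$: 

$$ ct = ct' + [(\gamma - 1)ct' + \gamma\,\boldsymbol{\beta}\cdot\mathbf{r}'], \tag{3.19a} $$
$$ \mathbf{r} = \mathbf{r}' + \left[ \frac{\gamma - 1}{\beta^2} (\boldsymbol{\beta}\cdot\mathbf{r}')\,\boldsymbol{\beta} + \gamma\boldsymbol{\beta} ct' \right]. \tag{3.19b} $$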


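We have used the coordinates (ξ, ζ, η) and (ξ', ζ', η') as a tool for 
obtaining the relation (3.18). Now, we discard them, and interpret r to mean 
(x, y, z), r' to mean (x', y', z'). Moreover, $\boldsymbol{\beta}$ is the boost velocity with 
components $(\beta_x, \beta_y, \beta_z)$, and $\boldsymbol{\beta}\cdot\mathbf{r}$ stands for $x\beta_x + y\beta_y + z\beta_z$. 

The Lorentz transformation matrix corresponding to Eq. (3.18) is as 
follows: 

$$ \Omega = \begin{pmatrix}
\gamma & -\gamma\beta_x & -\gamma\beta_y & -\gamma\beta_z \\
-\gamma\beta_x & 1 + \frac{\gamma - 1}{\beta^2}\beta_x^2 & \frac{\gamma - 1}{\beta^2}\beta_x\beta_y & \frac{\gamma - 1}{\beta^2}\beta_x\beta_z \\
-\gamma\beta_y & \frac{\gamma - 1}{\beta^2}\beta_y\beta_x & 1 + \frac{\gamma - 1}{\beta^2}\beta_y^2 & \frac{\gamma - 1}{\beta^2}\beta_y\beta_z \\
-\gamma\beta_z & \frac{\gamma - 1}{\beta^2}\beta_z\beta_x & \frac{\gamma - 1}{\beta^2}\beta_z\beta_y & 1 + \frac{\gamma - 1}{\beta^2}\beta_z^2
\end{pmatrix}. \tag{3.20} $$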


To see that “the boost in the X-direction” is a special case of Eqs. (3.18a) to 
(3.18e), one needs to set $\boldsymbol{\beta} = (\beta, 0, 0)$ in these equations to get back the 
equations in (3.8). 
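
The matrix of Eq. (3.20) can be built and tested numerically. The sketch below is only an illustration under stated assumptions: the helper names `boost_matrix`, `apply` and `interval`, and the sample velocities and event, are hypothetical. It checks two things: that the boost (β, 0, 0) gives back the matrix (3.14), and that an oblique boost leaves the quadratic form of Eq. (3.11) unchanged.

```python
import math

def boost_matrix(bx, by, bz):
    """4x4 matrix of Eq. (3.20) for the boost velocity c*(bx, by, bz)."""
    b = (bx, by, bz)
    b2 = bx * bx + by * by + bz * bz
    if b2 == 0.0:
        # No boost at all: return the identity matrix.
        return [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    g = 1.0 / math.sqrt(1.0 - b2)
    m = [[0.0] * 4 for _ in range(4)]
    m[0][0] = g
    for i in range(3):
        m[0][i + 1] = m[i + 1][0] = -g * b[i]
        for j in range(3):
            delta = 1.0 if i == j else 0.0
            m[i + 1][j + 1] = delta + (g - 1.0) * b[i] * b[j] / b2
    return m

def apply(m, v):
    """Multiply the 4x4 matrix m into the column (ct, x, y, z)."""
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]

def interval(v):
    """Quadratic form of Eq. (3.11): (ct)^2 - (x^2 + y^2 + z^2)."""
    ct, x, y, z = v
    return ct * ct - (x * x + y * y + z * z)

x_boost = boost_matrix(0.6, 0.0, 0.0)     # should reproduce Eq. (3.14)
oblique = boost_matrix(0.3, -0.4, 0.5)    # hypothetical oblique boost
event = [7.0, 1.0, 2.0, 3.0]              # hypothetical event in S

print(x_boost)
print(interval(event), interval(apply(oblique, event)))
```

The two printed interval values should agree to rounding accuracy, which is the numerical counterpart of the requirement (3.11).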


All relativistic formulas must reduce to the corresponding non-relativistic 
forms in the limit of small velocities. This is a general 
requirement which we shall use from time to time as one of the checks 
whenever any new result is derived. In the present case, the non-relativistic 
limit implies γ ≈ 1. Using this approximation in Eq. (3.18) (and 
neglecting the term $\boldsymbol{\beta}\cdot\mathbf{r}$ in comparison with ct), it is easy to see that 

$$ ct' = ct, \qquad \mathbf{r}' = \mathbf{r} - c\boldsymbol{\beta} t. \tag{3.21} $$

These are identical to the homogeneous form (2.2), i.e. corresponding to 
$\mathbf{r}_0 = 0$, if we recognize that $\mathbf{u} = c\boldsymbol{\beta}$. 

It will be useful to note at this point that the requirement of Eq. (3.11) is 
satisfied also by a pure rotation, i.e. the transformation in which the (x', y', 
z') coordinates are obtained from (x, y, z) (or vice versa) by a rotation of the 
axes XYZ to X'Y'Z', thereby leaving ct' = ct. Such a transformation of (x, y, 
z) → (x', y', z') falls under the general class of three-dimensional orthogonal 
transformation (see also Sec. 7.5). In fact, the definition (3.11) is also 
satisfied by a pure space reflection, i.e. x' = −x, y' = −y, z' = −z, ct' = ct, and 
a pure time reversal, i.e. x' = x, y' = y, z' = z, ct' = −ct. What we shall call 
Lorentz transformation in this book will include (a) pure boost, (b) pure 
rotation and (c) a combination of boost and rotation, but will exclude space 
reflection and time reversal. Such a transformation is normally called 
proper Lorentz transformation. It will be a useful exercise for the reader to 
show that the effect of two successive boosts is in general equal to one 
boost and one rotation, so that boost and rotation are in fact inseparable.* 
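
A numerical hint towards the exercise on successive boosts is sketched below. It relies on an observation that can be read off from Eq. (3.20): the matrix of a pure boost is symmetric. The code multiplies a boost along X by a boost along Y and tests the product for symmetry. The helper names (`boost_x`, `boost_y`, `matmul`, `is_symmetric`) and the value β = 0.6 are assumptions made for the illustration; the sketch does not extract the rotation itself.

```python
import math

def boost_x(beta):
    """Matrix (3.14): boost c*beta along the X-axis."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[g, -g * beta, 0.0, 0.0],
            [-g * beta, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def boost_y(beta):
    """Matrix (3.20) specialised to a boost c*beta along the Y-axis."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[g, 0.0, -g * beta, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [-g * beta, 0.0, g, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def is_symmetric(m, tol=1e-12):
    return all(abs(m[i][j] - m[j][i]) <= tol
               for i in range(4) for j in range(4))

# First boost along X, then along Y (the matrix applied first stands on the right).
combined = matmul(boost_y(0.6), boost_x(0.6))

print("single boost symmetric:  ", is_symmetric(boost_x(0.6)))
print("combined boost symmetric:", is_symmetric(combined))
```

The combined matrix fails the symmetry test, so it cannot be a pure boost of the form (3.20); this is consistent with the statement that two successive non-collinear boosts amount to one boost together with a rotation.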


3.3. Simple Applications of Lorentz Transformation 


In order to illustrate the meaning of LT, we shall use Eqs. (3.8) and (3.9) to 
retrieve the time dilation and length contraction formulas derived in Secs. 
2.6 and 2.7. 

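Consider two events “A” and “B” having coordinates 

$$ \text{“A”}: \begin{cases} (ct, x, y, z) & \text{in } S, \\ (ct', x', y', z') & \text{in } S', \end{cases} $$
$$ \text{“B”}: \begin{cases} (ct + c\Delta t,\ x + \Delta x,\ y + \Delta y,\ z + \Delta z) & \text{in } S, \\ (ct' + c\Delta t',\ x' + \Delta x',\ y' + \Delta y',\ z' + \Delta z') & \text{in } S'. \end{cases} $$

We can obtain the transformations of the coordinates of “B” and “A” 
and take the difference to get the transformation of the coordinate 
differences: 

$$ c\Delta t' = \gamma (c\Delta t - \beta\Delta x), \quad \Delta x' = \gamma (\Delta x - \beta c\Delta t), \quad \Delta y' = \Delta y, \quad \Delta z' = \Delta z, \tag{3.22a} $$
$$ c\Delta t = \gamma (c\Delta t' + \beta\Delta x'), \quad \Delta x = \gamma (\Delta x' + \beta c\Delta t'), \quad \Delta y = \Delta y', \quad \Delta z = \Delta z'. \tag{3.22b} $$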


These equations mean that the coordinate differences follow the same 
Lorentz transformation as the coordinates themselves. 

Now consider two events “A” and “B” which occur at the same spatial 
location (x', y', z'), but at two different times ct' and ct' + cΔt', as seen from 
the frame S'. In that case Δx' = 0, so that Δt' is the proper time between 
these events. The corresponding time interval Δt measured in the frame S is 
the “improper” time, and is obtained from the first of the transformation 
equations given in (3.22b), 

$$ \Delta t = \gamma\Delta t' = \gamma\Delta\tau. \tag{3.23} $$

We thus recover Rule 2 given in Eq. (2.23). Note that we have used the symbol Δτ 
to mean the proper time between the two events. 

Let us elucidate with a commonplace example. A train arrives at Bhopal 
at 10 hours, and at Nagpur at 18 hours. These two arrivals are examples of 
the events “A” and “B” considered here. They occur at the same spatial 
location (x', y', z') with reference to the frame S' of the train, but at different 
spatial locations with reference to the frame S of a station master who is 
fixed on the ground. The time interval Δt' = Δτ = 6 hours as noted by a 
passenger of the train is the proper time between reaching Bhopal and 
reaching Nagpur. The corresponding time Δt that is recorded by the station 
master is in this case “improper time” and will be more than 6 hours. 

Let us now consider the length measurement of a moving rod AB using 
a shadow-graph in a laboratory S, as illustrated in Fig. 2.8. Two flashes of 
light emitted simultaneously by the flash guns G₁ and G₂ graze past the 
endpoints A and B of the speeding rod and etch marks P₁, P₂ on the 
experiment table, as illustrated in Fig. 3.5. 

Two events are involved in this measurement: “A” = “A coincides with 
P₁” and “B” = “B coincides with P₂”. The flash guns are riveted to the 
experiment table in the Lab frame S and the points P₁, P₂ are fixed points 
on this table. The distance between them is Δx = L in S. Also, the two events take 
place simultaneously in S (but not in S'), so that Δt = 0. 


Fig. 3.5 Length measurement using LT. 


The rest frame S' of the rod is moving with velocity βc with respect to S. 
The endpoints A and B of the rod are the locations of the events “A” and 
“B” as seen from S'. The distance Δx' = L₀ between these events is the 
“proper length” of the rod, which is now determined from the second of the 
transformation equations given in (3.22a): 

$$ \Delta x' = \gamma\Delta x, \quad \text{or} \quad L_0 = \gamma L. \tag{3.24} $$

We thus recover the length contraction formula (2.27). 
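
Both results of this section can be reproduced by feeding coordinate differences into Eq. (3.22). The following is a minimal sketch; the function name `transform_differences` and the numbers (β = 0.6, a proper time of 6 units, a measured length of 4 units) are assumptions chosen so that γ = 1.25 gives round values.

```python
import math

def transform_differences(c_dt, dx, beta):
    """Eq. (3.22a): (c*dt, dx) measured in S -> (c*dt', dx') measured in S'."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return g * (c_dt - beta * dx), g * (dx - beta * c_dt)

beta = 0.6   # assumed boost speed, so that gamma = 1.25

# Time dilation: two events at the same place in S' (dx' = 0), with
# c*dt' = 6. Going from S' to S is Eq. (3.22b), i.e. the same map with -beta.
c_dt, dx = transform_differences(6.0, 0.0, -beta)

# Length contraction: two marks made simultaneously in S (c*dt = 0),
# a distance L = 4 apart. Their separation in S' is the proper length L0.
c_dt_prime, dx_prime = transform_differences(0.0, 4.0, beta)

print("improper time c*dt =", c_dt)       # gamma * 6
print("proper length L0   =", dx_prime)   # gamma * 4
```

Note that `c_dt_prime` comes out non-zero: the two marking events, simultaneous in S, are not simultaneous in S'.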


3.4. Time-Like, Light-Like, Space-Like Intervals 


The coordinate difference between two events satisfies the Lorentz 
transformation according to Eq. (3.22). Hence, the property (3.11) should 
also be shared by the coordinate difference, i.e., 

$$ c^2\Delta t'^2 - (\Delta x'^2 + \Delta y'^2 + \Delta z'^2) = c^2\Delta t^2 - (\Delta x^2 + \Delta y^2 + \Delta z^2) \equiv \Delta s^2. \tag{3.25} $$

We have denoted this invariant quantity by the symbol Δs². The 
quantity Δs² as written above will be called the square of the interval, or (in 
order to be brief) just the interval, between the two events. This interval 
will be called 

• time-like, if Δs² > 0; 
• space-like, if Δs² < 0; 
• light-like, if Δs² = 0. 


The significance of these strange names can be explained as follows. 


• If the interval is time-like, then we can find a frame of reference S', 
moving with a boost velocity cβ, β < 1, with respect to S, such that the 
events occur at the same spatial location, say the origin, but at different 
instants of time, say, $t'_A$ and $t'_B = t'_A + \Delta\tau$, where Δτ is the proper time 
between the events. In this case, Δx' = Δy' = Δz' = 0 and cΔt' = cΔτ. In 
other words, the coordinate difference has only a time component, but no 
space component. Hence, the name time-like. 

Note that, for a time-like separation between two events, Δτ = Δs/c. 

• If the interval is space-like, then we can find a frame of reference S', 
moving with a boost velocity cβ, β < 1, with respect to S, such that the 
events occur at the same time, say at t' = 0. That is, they are simultaneous, 
but occur at different spatial locations, say, $x'_A$ and $x'_B = x'_A + \Delta l$. In other 
words, the coordinate difference has only a space component, but no time 
component. Hence, the name space-like. 

Note that, for a space-like separation between two events, there is no 
proper time, because Δs is imaginary. 

• If the interval is light-like, then there is no frame of reference in which the 
events can occur at the same spatial location. In other words, the two 
events cannot be connected by any frame of reference. Only a light 
signal, e.g. “Θ₁” = a radio message is sent; “Θ₂” = the same message is 
received, can connect the events. In this case, the proper time Δτ = 0. 
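
The classification above, together with the special frame that exists in the time-like and space-like cases, can be sketched in a few lines. The function names are hypothetical, and `special_frame_beta` assumes that the two events are separated along the X-axis only. The exact comparison with zero used for the light-like case is adequate for the round numbers below but would be fragile for general floating-point data.

```python
def interval_squared(c_dt, dx, dy=0.0, dz=0.0):
    """Delta s^2 of Eq. (3.25)."""
    return c_dt ** 2 - (dx ** 2 + dy ** 2 + dz ** 2)

def classify(c_dt, dx, dy=0.0, dz=0.0):
    """Name of the interval according to the sign of Delta s^2."""
    s2 = interval_squared(c_dt, dx, dy, dz)
    if s2 > 0:
        return "time-like"
    if s2 < 0:
        return "space-like"
    return "light-like"

def special_frame_beta(c_dt, dx):
    """Boost along X that makes dx' = 0 (time-like) or c*dt' = 0 (space-like).

    Follows from Eq. (3.22a); returns None for a light-like interval,
    for which no such frame with beta < 1 exists.
    """
    kind = classify(c_dt, dx)
    if kind == "time-like":
        return dx / c_dt      # dx' = gamma*(dx - beta*c_dt) = 0
    if kind == "space-like":
        return c_dt / dx      # c*dt' = gamma*(c_dt - beta*dx) = 0
    return None

for c_dt, dx in [(5.0, 3.0), (3.0, 5.0), (2.0, 2.0)]:
    print((c_dt, dx), classify(c_dt, dx), special_frame_beta(c_dt, dx))
```

In both non-null cases the required boost speed is the ratio of the smaller to the larger coordinate difference, which is why it is always less than 1.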


3.5. Relativistic Doppler Formula 


As another interesting application of LT, we shall obtain a relativistic 
formula for the Doppler effect. Figure 3.6 shows a transmission tower of 
height h moving in the X-direction with velocity cβ. At the top of the tower 
is the transmitter K. Fixed on the ground is the receiving station O. 
Moreover, S and S' are, respectively, the rest frames of the receiving station 
and the transmitter. Consider the following events: 

• “Θ_K” = “the transmitter transmits a sharp beep signal”; 
• “Θ_O” = “the receiver receives that signal”. 

We write event coordinates as (ct, x, y), leaving out the z coordinate. 
Noting that x = βct, we write 

$$ \text{“}\Theta_K\text{”} = \begin{cases} (ct', 0, h) & \text{in } S', \\ (ct, \beta ct, y) & \text{in } S. \end{cases} $$

Fig. 3.6 Explaining Doppler effect. 


Applying the Lorentz transformation equation (3.9) to the time coordinate 
and to the y coordinate, we get 

$$ ct = \gamma ct'; \qquad y = h. \tag{3.26} $$

Since the reception takes place at the origin of S, we can write its 
coordinates as “Θ_O” = (cT, 0, 0) in S, where T is the time when the signal is 
received. The signal has propagated from K to O, a distance of 
$\sqrt{(\beta ct)^2 + h^2}$, in time (T − t). Therefore 

$$ (\beta ct)^2 + h^2 = c^2 (T - t)^2. \tag{3.27} $$

Using Eq. (3.26), we get 

$$ (\gamma\beta ct')^2 + h^2 = c^2 (T - \gamma t')^2. $$

Simplifying, 

$$ T^2 + t'^2 - 2\gamma T t' = \frac{h^2}{c^2}, \tag{3.28} $$


which is the relation between the transmission time t' and the reception time 
T, each measured at its respective station. 

Now, let there be a continuous beam of a sinusoidal radio wave 
transmitted from the moving tower, whose frequency is measured to be f₀ at 
the transmission station and f at the receiving station. The wave consists of 
a series of crests separated by a time interval which equals dt' = 1/f₀ and 
dT = 1/f as measured at the transmitting and at the receiving station, 
respectively. Therefore, by differentiating (3.28) we get the required 
relation between f₀ and f: 

$$ \frac{dT}{dt'} = \frac{\gamma T - t'}{T - \gamma t'}. \tag{3.29} $$

It follows from Eqs. (3.26) and (3.27) that 
$T - \gamma t' = T - t = \sqrt{(\beta ct)^2 + h^2}/c = l/c$, where 
l = length of the hypotenuse OK. Also, βct = l cos θ, so that 
$\gamma T - t' = \gamma (T - t) + (\gamma^2 - 1)t' = \gamma (l/c)(1 + \beta\cos\theta)$. 
Therefore, from (3.29) 

$$ \frac{dT}{dt'} = \gamma (1 + \beta\cos\theta), \tag{3.30} $$

where θ is the angle of elevation of the transmitter when viewed from O. In 
other words 

$$ f = \frac{f_0}{\gamma (1 + \beta\cos\theta)}. \tag{3.31} $$


Equation (3.31) represents the relativistic Doppler formula. The angle θ is 
to be interpreted as the angle between the direction of motion of the source 
of light and the ray line. The reader must have realized that this formula is 
valid only for light transmitted by a moving source. 

We shall write r = f/f₀ and specialize the formula for three special cases: 

$$ \begin{array}{lll}
\theta = 0 & \text{longitudinal; source receding} & r = \sqrt{\dfrac{1 - \beta}{1 + \beta}}, \\[2ex]
\theta = \dfrac{\pi}{2} & \text{transverse; source vertically up} & r = \dfrac{1}{\gamma}, \\[2ex]
\theta = \pi & \text{longitudinal; source approaching} & r = \sqrt{\dfrac{1 + \beta}{1 - \beta}}.
\end{array} \tag{3.32} $$


For non-relativistic Doppler effect, we set γ = 1, and get 

$$ f = \frac{f_0}{1 + \beta\cos\theta}, \tag{3.33} $$

which is valid when the transmission tower (or the source of the wave) is 
moving with speed v ≪ c. This formula is valid for sound as well if we 
substitute for c the speed of sound [11]. 
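
The Doppler formula (3.31) and its three special cases (3.32) are easy to tabulate. The sketch below is an illustration only; the function name `doppler_ratio` and the value β = 0.6 (for which γ = 1.25) are assumptions.

```python
import math

def doppler_ratio(beta, theta):
    """f/f0 from Eq. (3.31) for a source moving with speed c*beta.

    theta is the angle between the direction of motion of the source
    and the ray line, in radians.
    """
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return 1.0 / (g * (1.0 + beta * math.cos(theta)))

beta = 0.6
receding = doppler_ratio(beta, 0.0)             # sqrt((1 - beta)/(1 + beta))
transverse = doppler_ratio(beta, math.pi / 2)   # 1/gamma
approaching = doppler_ratio(beta, math.pi)      # sqrt((1 + beta)/(1 - beta))

print(receding, transverse, approaching)
```

For this choice of β the received frequency is halved for a receding source, doubled for an approaching one, and lowered by the factor 1/γ in the transverse case, where the non-relativistic formula (3.33) would predict no shift at all.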

The simple look of the Doppler formula (3.31) is misleading. In actual 
application, it can present conceptual and mathematical challenges. We 
have presented two examples for a better appreciation of the formula. One 
of them is a detailed workout in Sec. 8.9.5. The other one is Exercise R3 for 
the reader, in Sec. 8.10. 


3.6. Worked Out Problems I 


Problem 3.1. Let R and R' be two straight rods sliding past each other with 
relative speed cβ as shown in Fig. 3.7. Prove the following theorem as a 
corollary of Theorem 2.1. 


Theorem 3.1. Let the segment AB of R coincide with the segment A'B' of R' 
when viewed from R (Fig. 3.7(a)), and with the segment A'B'' of R' when 
viewed from R' (Fig. 3.7(b)). If ℓ' and ℓ'' be the proper lengths of A'B' and 
A'B'', respectively, then 

$$ \ell'' = \frac{\ell'}{\gamma^2}. \tag{3.34} $$

Fig. 3.7. Problem 3.1. (Panel (a): view from R at ct = ct' = 0.) 


Solution to Problem 3.1 

The proper length A'B' shrinks to proper length AB when seen from R. 
The proper length AB shrinks to proper length A'B'' when seen from R'. 
Let ℓ be the proper length of the segment AB. 

Hence, by Theorem 2.1, 

$$ \ell = \frac{\ell'}{\gamma}, \qquad \ell'' = \frac{\ell}{\gamma}. $$

Therefore, 

$$ \ell'' = \frac{\ell'}{\gamma^2}. $$

Problem 3.2. Prove the following theorem as a corollary of Theorem 3.1. 


Theorem 3.2. Consider two events “θ” and “φ” occurring at the locations 
(A, B) on R and (A', B') on R'. Let these events be simultaneous when 
viewed from R (so that $ct_\theta = ct_\phi$). Then, the time interval between these 
events, when viewed from R', is given by 

$$ c(t'_\theta - t'_\phi) = \beta\ell', \tag{3.35} $$

where ℓ' is the proper length of the segment A'B'. 


Solution to Problem 3.2 


(a) Solution using length contraction formula 

In the following, we shall write (A : A') to mean the event: A passes A' = A' 
passes A. 

Consider the following three events: θ = (A : A'), φ = (B : B'), ψ = (B : B''). 

Seen from R, θ and φ are simultaneous, so that $ct_\theta = ct_\phi$. 

Seen from R', θ and ψ are simultaneous, so that $ct'_\theta = ct'_\psi$. 

Seen from R', as the rod R moves to the left, the event φ precedes the 
event ψ. 

According to Theorem 3.1 and Fig. 3.7, the proper length of B''B' 
equals $\ell' - \ell'' = \ell'\left(1 - \frac{1}{\gamma^2}\right) = \beta^2\ell'$. 

The mark B on the rod R, moving to the left with velocity cβ, covers 
the proper distance B''B' in time $\Delta t' = \frac{\beta^2\ell'}{c\beta} = \frac{\beta\ell'}{c}$, and the event ψ occurs after 
the event φ, so that $t'_\psi > t'_\phi$. 

Therefore, with respect to the frame R', 

$$ c(t'_\theta - t'_\phi) = c(t'_\psi - t'_\phi) = \beta\ell'. $$


(b) Solution using Lorentz transformation 

We have pictured the standard configuration in Fig. 3.8. The “moving” rod 
R' is stationary in the frame S', and the rod R is stationary in the frame S. 
The events θ and φ are simultaneous in S. Their coordinates (ct, x) in S and 
S' are written as follows: 

$$ \theta = (ct_\theta, x_\theta) = (0, 0), \quad \phi = (ct_\phi, x_\phi) = (0, \ell) \quad \text{in } S, \tag{3.36a} $$
$$ \theta = (ct'_\theta, x'_\theta) = (0, 0), \quad \phi = (ct'_\phi, x'_\phi) \quad \text{in } S'. \tag{3.36b} $$

Apply LT: 

$$ ct'_\phi = \gamma (ct_\phi - \beta x_\phi) = \gamma (0 - \beta\ell) = -\gamma\beta\ell, \tag{3.36c} $$
$$ x'_\phi = \gamma (x_\phi - \beta ct_\phi) = \gamma (\ell - \beta \times 0) = \gamma\ell. \tag{3.36d} $$

Fig. 3.8. Problem 3.2. 

Note that $x'_\phi = \ell'$, i.e. the proper length between A' and B'. Therefore, 

from Eq. (3.36d): ℓ' = γℓ; 

from Eqs. (3.36b) and (3.36c): 

$$ c(t'_\theta - t'_\phi) = 0 - (-\gamma\beta\ell) = \gamma\beta\ell = \beta\ell'. \tag{3.37} $$


Problem 3.3. Prove the following theorem. 


Theorem 3.3. Let “θ”, “φ” and “ψ” be three events, such that “θ” and “φ” 
occur at the same spatial location in S, “θ” and “ψ” at the same spatial 
location in S'. If the time intervals, as measured in S', are $T'_1$ between “θ” 
and “φ”, and $T'_2$ between “θ” and “ψ”, and if “φ” and “ψ” are simultaneous 
in S, then 

$$ T'_2 = \frac{T'_1}{\gamma^2}. \tag{3.38} $$

Compare Eq. (3.38) with Eq. (3.34). (See Example 3.5 for illustration of 
this theorem.) 


Fig. 3.9 Problem 3.3. 


Solution to Problem 3.3 

(a) Solution using time-dilation formula 

In Fig. 3.9, we have explained the configurations of the frames S and S' at 
the three events (θ, φ, ψ), which we identify with three lightning strokes 
striking at different times as shown in the figure. We take the timing of the 
event θ to be $T_0 = T'_0 = 0$, that of the event φ to be $T_1, T'_1$ and of the event ψ to 
be $T_2, T'_2$, with respect to S and S', respectively. Then 

$T_1$ = proper time between θ and φ, so that $T_1 = T'_1/\gamma$. 
$T'_2$ = proper time between θ and ψ, so that $T'_2 = T_2/\gamma$. 
Also, $T_1 = T_2$ by assumption.   (3.39) 
Hence, $T'_1/\gamma = \gamma T'_2$, 

or $T'_2 = T'_1/\gamma^2$. 


(b) Solution using LT 

$$ \theta = (0, 0); \quad \phi = (cT_1, 0); \quad \psi = (cT_2, x_\psi) \quad \text{in } S, $$
$$ \theta = (0, 0); \quad \phi = (cT'_1, x'_\phi); \quad \psi = (cT'_2, 0) \quad \text{in } S'. $$

Apply LT: 
$$ cT'_1 = \gamma (cT_1 - \beta \times 0) = \gamma cT_1. \tag{3.40} $$
Apply inverse LT: 
$$ cT_2 = \gamma (cT'_2 + \beta \times 0) = \gamma cT'_2. $$

By assumption, $T_2 = T_1$. Hence, $cT'_1 = \gamma cT_2 = \gamma^2 cT'_2$. 


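Problem 3.4. (a) Rewrite Eq. (3.12) in the form: 

$$ X' = \Omega X, \tag{3.41} $$

where 

$$ X' = \begin{pmatrix} ct' \\ x' \\ y' \\ z' \end{pmatrix}, \qquad X = \begin{pmatrix} ct \\ x \\ y \\ z \end{pmatrix} $$

are column matrices. Let $\tilde{X}' = (ct', x', y', z')$, $\tilde{X} = (ct, x, y, z)$ be transposes of X' 
and X, respectively, and let 

$$ G = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}. \tag{3.42} $$

(This G is identical with the metric tensor to be discussed in Sec. 7.9.3.) 
Show that Eq. (3.11) can be expressed compactly as follows: 

$$ \tilde{X}' G X' = \tilde{X} G X. \tag{3.43} $$

(b) Hence, establish the following important property of any LT matrix: 

$$ \tilde{\Omega} G \Omega = G, \tag{3.44} $$

where $\tilde{\Omega}$ represents the transpose of Ω. 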


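Solution to Problem 3.4 

(a) 

$$ \text{l.h.s.} = (ct', x', y', z') \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \begin{pmatrix} ct' \\ x' \\ y' \\ z' \end{pmatrix} = c^2t'^2 - x'^2 - y'^2 - z'^2. $$

Similarly, the r.h.s. $= c^2t^2 - x^2 - y^2 - z^2$. 

(b) 

$$ \tilde{X}' = (ct', x', y', z') = \widetilde{(\Omega X)} = \tilde{X}\tilde{\Omega}, $$
$$ \tilde{X}' G X' = (\tilde{X}\tilde{\Omega}) G (\Omega X) = \tilde{X} (\tilde{\Omega} G \Omega) X = \tilde{X} G X. \tag{3.45} $$

Since this holds for every X, $\tilde{\Omega} G \Omega = G$. 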
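
The property (3.44) can be verified numerically for the X-boost matrix (3.14). The following is a small sketch under stated assumptions: the helper names `transpose` and `matmul` and the value β = 0.8 are illustrative, and matrices are stored as nested lists.

```python
import math

def transpose(m):
    """Transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

# The matrix G of Eq. (3.42).
G = [[1.0, 0.0, 0.0, 0.0],
     [0.0, -1.0, 0.0, 0.0],
     [0.0, 0.0, -1.0, 0.0],
     [0.0, 0.0, 0.0, -1.0]]

beta = 0.8                                   # assumed boost speed
g = 1.0 / math.sqrt(1.0 - beta * beta)       # Lorentz factor, here 5/3
omega = [[g, -g * beta, 0.0, 0.0],           # Eq. (3.14)
         [-g * beta, g, 0.0, 0.0],
         [0.0, 0.0, 1.0, 0.0],
         [0.0, 0.0, 0.0, 1.0]]

lhs = matmul(matmul(transpose(omega), G), omega)   # left side of Eq. (3.44)
for row in lhs:
    print([round(value, 12) for value in row])
```

Up to rounding, the printed rows reproduce G itself.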


Problem 3.5. Let “Θ₁”, “Θ₂” be simultaneous events in S occurring along 
the X-axis. Show that they remain simultaneous events in S' under the 
boost S(0, cβ, 0)S', or more generally under the boost S(0, cβ_y, cβ_z)S'. 


Solution to Problem 3.5 
Refer to the Lorentz transformation (3.18) for the general case. Note how 
time transforms: 

$$ ct' = \gamma (ct - \boldsymbol{\beta}\cdot\mathbf{r}). \tag{3.46} $$

Suppose the first event occurs at the common origin of the frames S and S' 
when the time is set to zero in both frames. For the second event, ct = 0 
because it is simultaneous with the first in S, and $\boldsymbol{\beta}\cdot\mathbf{r} = 0$ 
according to our assumption, since r lies along the X-axis while $\boldsymbol{\beta}$ is 
perpendicular to it. Hence, ct' = 0. 


Problem 3.6. As shown in Eq. (3.22), the coordinate differences between 
the events “Θ₁”, “Θ₂” undergo the same LT as the coordinates. We shall 
rewrite the same equation more clearly as follows. 

$$ c(t'_2 - t'_1) = \gamma [c(t_2 - t_1) - \beta (x_2 - x_1)], \tag{3.47a} $$
$$ (x'_2 - x'_1) = \gamma [(x_2 - x_1) - \beta c(t_2 - t_1)], \tag{3.47b} $$
$$ (y'_2 - y'_1) = (y_2 - y_1), \tag{3.47c} $$
$$ (z'_2 - z'_1) = (z_2 - z_1). \tag{3.47d} $$

Using the above transformation equations, prove the following statements: 

(a) There exist frames A and B having inverse time relations for the events 
“Θ₁” and “Θ₂” (i.e. if “Θ₁” happens before “Θ₂” in A, then “Θ₁” happens 
after “Θ₂” in B), if and only if there exists a frame S in which these 
events are simultaneous. [In fact there exists an infinity of such frames. 
See Example 3.11.] 

(b) The temporal sequence of the events “Θ₁” and “Θ₂” is the same in all 
frames (i.e. if “Θ₁” occurs before “Θ₂” in some frame A, then the same 
must be true in all frames) if and only if there exists a frame S in which 
these events occur at the same spatial location, in which case the least 
time interval between the events is the (proper) time measured in S. 


Solution to Problem 3.6 

(a) Set $c(t_2 - t_1) = 0$ in Eq. (3.47a). Get 

$$ c(t'_2 - t'_1) = -\gamma\beta (x_2 - x_1). \tag{3.48} $$

Therefore, taking $x_2 > x_1$, $t'_2 - t'_1 \lessgtr 0$ for β positive/negative. 

(b) Set $(x_2 - x_1) = 0$ in Eq. (3.47a). Get 

$$ c(t'_2 - t'_1) = \gamma c(t_2 - t_1). \tag{3.49} $$

Therefore, if $t_2 - t_1 > 0$, then $t'_2 - t'_1 > 0$ for any value of γ, and 
$t'_2 - t'_1 \ge t_2 - t_1$, since γ ≥ 1. 
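
The two statements of Problem 3.6 can be explored numerically by scanning over boost speeds in Eq. (3.47a). The sketch below is only an illustration: the function name `c_dt_prime`, the grid of β values and the two sample separations are assumptions.

```python
import math

def c_dt_prime(c_dt, dx, beta):
    """Eq. (3.47a): time separation c*(t2' - t1') in the boosted frame."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return g * (c_dt - beta * dx)

# Boost speeds from -0.99 to +0.99 in steps of 0.01.
betas = [k / 100.0 for k in range(-99, 100)]

# Space-like pair: c*dt = 1, dx = 2 (simultaneous in the frame with beta = 0.5).
reversed_somewhere = any(c_dt_prime(1.0, 2.0, b) < 0 for b in betas)
kept_somewhere = any(c_dt_prime(1.0, 2.0, b) > 0 for b in betas)

# Time-like pair: c*dt = 2, dx = 1 (same place in the frame with beta = 0.5).
order_always_kept = all(c_dt_prime(2.0, 1.0, b) > 0 for b in betas)

print("space-like: order reversed in some frame:", reversed_somewhere)
print("space-like: order kept in some frame:    ", kept_somewhere)
print("time-like:  order kept in every frame:   ", order_always_kept)
```

For the space-like pair the sign of the time separation changes as β crosses 0.5, the frame in which the events are simultaneous; for the time-like pair no boost on the grid changes the order.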


3.7. Illustrative Numerical Examples I 


Take c = 3 × 10⁸ m/sec for numerical exercises in this chapter. We have 
used “light-second” (abbreviated as lt-sec) as an alternative unit of length. 1 
lt-sec = 3 × 10⁸ m. While working out the exercises, the reader may find it 
convenient to convert the “lt-sec” unit to the “meter” unit by multiplying with c. 
For example, 5.2 lt-sec = 5.2c m. 

Examples 3.1-3.6 allude to an imaginary superfast (and superlong) 
space-train called Cosmos Express shown in Fig. 3.10. It is coasting with a 
uniform speed β = 4/5. Its (proper) length is ℓ = 6 × 10⁸ m (i.e. 2 lt-sec). 
Along its straight line route lie two space stations named Andromeda and 
Vega (no relative velocity between them) manned by the respective station 
masters Amar and Vivek, to be referred to as Mr A and Mr V. Their offices 
(shown by the letters A and V) are D = 3 × 10⁹ m (i.e. 10 lt-sec) apart. 
Peter, the passenger (to be referred to as Mr P), is travelling in the centre 
coach. The intervening terrain between A and V being treacherous 
(occasional attacks by space pirates), Peter's mother prays for his safe 
passage, as she sits quietly at A. Clocks in the space station frame S and the 
train's frame S' are set to zero, i.e. t = t' = 0, when “P passes A”. At the 
same instant “mother starts praying”. 


Fig. 3.10. Cosmos Express running between A and V. (Values marked in the 
figure: D = 10 lt-sec, ℓ = 2 lt-sec, β = 4/5, d = 1.2 lt-sec.) 


In the following questions, all primed quantities (e.g. T', D') refer to 
measurements in S', and the unprimed quantities to those in S. All messages 
between individuals or stations are radio messages (so that they are 
transmitted with the speed of light). Answer the questions asked in 
Examples 3.1-3.5 by applying Rules 1, 2 and 3 (i.e. without applying LT). 


Example 3.1. “Mother finishes praying” at t = $T_1$, when “P passes V” (i.e. 
when Peter has reached Vega), according to clocks in S. At the same instant 
“she sends her good wishes” to Peter. “Peter receives her message” at t = 
$T_3$, t' = $T'_3$. Obtain $T_1$ and $T_3$. 


Solution. Let us first identify the events involved, and their coordinates (x, 
ct) in S and (x', ct') in S'. We use the symbol † to mean that we are not 
interested in this particular coordinate: 

Θ₀ = “P passes A”: (x = 0, ct = 0); (x' = 0, ct' = 0); 
Θ₁ = “Mother finishes prayer”: (x = 0, ct = $cT_1$); (x' = †, ct' = $cT'_1$); 
Θ₂ = “P passes V”: (x = D, ct = $cT_2$); (x' = 0, ct' = $cT'_2$); 
Θ₃ = “P receives message”: (x = †, ct = $cT_3$); (x' = †, ct' = $cT'_3$). 

Note that “Mother finishes praying” and “Mother sends message” are 
the same event Θ₁. In time $T_1$, the train has moved a distance D with speed 
βc. Also, the events Θ₁ and Θ₂ are simultaneous events in S. Hence, 

$$ T_1 = T_2 = \frac{D}{c\beta} = \frac{10c\ \text{m}}{\frac{4}{5}c\ \text{m/s}} = 12.5\ \text{s}. $$

The message is travelling at speed c between Θ₁ and Θ₃. It is sent when P 
has already moved a distance D. Therefore, the message has to make up the extra 
distance D on the train, which is itself travelling at speed βc. Hence, 

$$ c(T_3 - T_1) = D + \beta c(T_3 - T_1) \ \Rightarrow\ T_3 - T_1 = \frac{D}{(1 - \beta)c} = \frac{10c\ \text{m}}{\frac{1}{5}c\ \text{m/s}} = 50\ \text{s}, $$

so that $T_3$ = 62.5 s. 


Note: None of the rules of relativity has been used in finding the above 
three answers. 


Example 3.2. Determine $T'_3$, using Rule 2. 


Solution. Note that $T'_3$ is the proper time between the events Θ₀ and Θ₃ 
(because Mr P is present at both events). We shall use Eq. (2.23) to relate $T'_3$ 
and the “improper time” $T_3$ between the same events, which has 
already been found in the last question. For this, we need the Lorentz factor 
γ: 

$$ \gamma = \frac{1}{\sqrt{1 - \beta^2}} = \frac{1}{\sqrt{1 - \frac{16}{25}}} = \frac{5}{3} \ \Rightarrow\ T'_3 = \frac{T_3}{\gamma} = \frac{62.5}{5/3} = 37.5\ \text{s}. $$


Example 3.3. From the time $T'_3$, Peter determines the time t' = $T'_1$ when his 
“mother finished praying and sent the message”. Find out $T'_1$. [Hint: Peter 
does not know relativity, but knows that his mother has been receding from 
him at the speed β = 4/5.] 


Solution. In the time $T'_1$, which is the time interval between Θ₀ and Θ₁ as 
measured in S', the point A has moved with velocity −βc (i.e. in the 
negative X'-direction) a certain distance a, i.e. from x' = 0 to x' = −a. 
Therefore, 

$$ a = \beta cT'_1 = \tfrac{4}{5}cT'_1. $$

In the subsequent time interval $T'_3 - T'_1$ (measured in S'), P remains 
stationary at x' = 0 in his rest frame S', while the message travels from x' = 
−a to x' = 0 with speed c. Therefore, 

$$ c(T'_3 - T'_1) = a = \beta cT'_1 \ \Rightarrow\ T'_1 = \frac{T'_3}{1 + \beta} = \frac{37.5}{9/5} \approx 20.8\ \text{s}. $$

Example 3.4. Find the time t' = $T'_2$ when “P passes V”. [Hint: The line 
segment AV has a contracted length.] 


Solution. At time t' = $T'_2$ the point A has moved in the negative X' direction 
at a speed of βc to cover a distance D', which is the contracted length of the 
proper length D. Therefore, 

$$ \beta cT'_2 = D' = \frac{D}{\gamma} \ \Rightarrow\ T'_2 = \frac{D}{\beta c\gamma} = \frac{10c\ \text{m}}{\frac{4}{5} \times \frac{5}{3}\, c\ \text{m/s}} = 7.5\ \text{s}. $$


Example 3.5. Summarize your answers obtained so far in the following 
tabular form: 

Duration of mother's prayer 
    according to S frame, $T_1$ = ... 
    according to S' frame, $T'_1$ = ... 
Duration of Peter's journey from A to V 
    according to S frame, $T_2$ = ... 
    according to S' frame, $T'_2$ = ... 

Verify that Rule 2 is illustrated in both sets of answers, namely 
$T_1 = \frac{1}{\gamma}T'_1$; $T'_2 = \frac{1}{\gamma}T_2$. 


Solution. The proper and “improper” times between Θ₀ and Θ₁ are $T_1$ and $T'_1$, 
respectively. So we expect that $T_1/T'_1 = 1/\gamma$. Similarly, the proper and “improper” 
times between Θ₀ and Θ₂ are $T'_2$ and $T_2$, respectively. So we expect that 
$T'_2/T_2 = 1/\gamma$. From the values found above, 

$$ \frac{T_1}{T'_1} = \frac{12.5}{20.83} = \frac{3}{5} = \frac{1}{\gamma}, \qquad \frac{T'_2}{T_2} = \frac{7.5}{12.5} = \frac{3}{5} = \frac{1}{\gamma}. $$

Hence, our expectations are confirmed. 


Example 3.6. If the length of the space station V is 1.2 lt-sec, how long 
does the length of Cosmos Express take to traverse the length of V (a) 
according to S? (b) according to S'? Answer by applying Rule 3. 


Solution. (a) Let $t$ be the required time in S. In this time, Cosmos Express travelling with speed $\beta c$ moves a distance equal to the length of the platform, which is the proper length $d$, plus the length of the train, which is the “improper” length $\ell'$. Hence

$$\ell' = \frac{\ell}{\gamma} = \frac{2c\ \text{m}}{5/3} = 1.2c\ \text{m}, \tag{3.50a}$$

$$\beta c t = \ell' + d = 1.2c + 1.2c = 2.4c\ \text{m}, \tag{3.50b}$$

$$t = \frac{\ell' + d}{\beta c} = \frac{2.4}{4/5} = 3\ \text{s}. \tag{3.50c}$$

(b) Let $t'$ be the required time in S'. In this time, platform V travelling with velocity $-\beta c$ moves a distance equal to the length of the train, which is the proper length $\ell$, plus the length of the platform, which is the “improper” length $d'$. Hence

$$d' = \frac{d}{\gamma} = \frac{1.2c\ \text{m}}{5/3} = 0.72c\ \text{m}, \tag{3.51a}$$

$$\beta c t' = \ell + d' = 2c + 0.72c = 2.72c\ \text{m}, \tag{3.51b}$$

$$t' = \frac{\ell + d'}{\beta c} = \frac{2.72}{4/5} = 3.4\ \text{s}. \tag{3.51c}$$
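The arithmetic of Rule 3 in this example can be checked with a few lines of Python. This is only a sketch: the variable names are illustrative, and the numbers ($\beta = 4/5$, a train of proper length 2 lt-sec, a station of proper length 1.2 lt-sec) are read off from the example.

```python
import math

beta = 4 / 5                      # speed of Cosmos Express in units of c
gamma = 1 / math.sqrt(1 - beta**2)

train_proper = 2.0                # proper length of the train, lt-sec
station_proper = 1.2              # proper length of station V, lt-sec

# (a) In S the station is at rest and the train is contracted.
t_S = (train_proper / gamma + station_proper) / beta

# (b) In S' the train is at rest and the station is contracted.
t_S_prime = (train_proper + station_proper / gamma) / beta

print(f"gamma   = {gamma:.4f}")
print(f"t  (S)  = {t_S:.2f} s")
print(f"t' (S') = {t_S_prime:.2f} s")
```

Neither time is a proper time here, because the two events (front of the train meets one end of the station, rear of the train leaves the other end) occur at different places in both frames; this is why the two answers are not related by a single factor of $\gamma$.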


Example 3.7. Obtain the answer in (b) of the last question by applying LT. 


Solution. We have shown the two boundary points of the station V, the left boundary L and the right boundary R. The office V is located at the midpoint between L and R. In order to apply the LT, we have to identify the two events involved, namely, “E passes L” and “G passes R”. The space and time intervals between these two events, as measured in S, are $\Delta x = d$ and $c\,\Delta t = (\ell' + d)/\beta$, the latter being $c$ times the time $t$ obtained in part (a) of the last problem. We shall obtain the values of these coordinates in S' by applying the transformation of coordinate differences as given in Eq. (3.22):

$$c\,\Delta t' = \gamma\,(c\,\Delta t - \beta\,\Delta x) = \gamma\left[\frac{\ell' + d}{\beta} - \beta d\right] = \frac{\gamma\ell'}{\beta} + \gamma\left(\frac{1}{\beta} - \beta\right)d = \frac{\ell}{\beta} + \frac{d}{\gamma\beta} = \frac{\ell + d'}{\beta}.$$

We have used Eqs. (2.22), (3.50a), (3.51a). We get the same answer as in (3.51c).


Example 3.8. Muons (denoted by the symbol $\mu$) are charged particles like electrons, but about 207 times heavier. They are present abundantly in the earth’s upper atmosphere, being produced by the bombardment of nuclei by cosmic rays. A muon decays spontaneously into an electron and a pair of neutrinos, the decay mean-life being $\tau_0 = 2.3 \times 10^{-6}$ sec in the muon’s rest frame. However, muons live longer as they travel through the atmosphere with relativistic speed very close to $c$. Their prolonged life is one confirmation of the time dilation formula.

A balloon is sent up to a height of 2 km in the atmosphere to measure the flux density of muons of average speed 0.995c. If a detector carried on the balloon counts 10 muons per minute, how many muons per minute, on the average, will the same apparatus count at the sea level? Assume that the muons are travelling vertically downward and that no muon is stopped by the atmosphere. [Hint: The number of muons falls off exponentially as $e^{-t/\tau}$, where $\tau$ is the mean-life and $t$ is the time of flight of the muons, both measured in the experimenter’s frame.]


Solution. Let $N_0$ be the number of muons to be detected by the balloon at the reference point A, which is located 2 km above sea level. Let $N_s$ be the same number at the sea level. The number of muons decreases with travel time $t$ from the point A downward, and is given as $N(t) = N_0 e^{-t/\tau}$, where $\tau$ is the mean life in the Earth frame. The mean life $\tau_0$ given in the problem is the “proper” lifetime of an average muon between the events “birth” and “death”. The two mean lives $\tau$ and $\tau_0$ are related by the time dilation formula.

$$\gamma = \frac{1}{\sqrt{1 - (0.995)^2}} \approx 10,$$

$$\tau = \gamma\tau_0 = 10 \times 2.3 \times 10^{-6} = 2.3 \times 10^{-5}\ \text{s},$$

$$t = \text{travel time of } \mu = \frac{2 \times 10^{3}}{0.995 \times 3 \times 10^{8}} = 0.67 \times 10^{-5}\ \text{s},$$

$$N_s = N_0 e^{-t/\tau} = 10 \times e^{-0.67/2.3} = 7.47\ \text{per minute}.$$
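To see how much of this result is due to time dilation, the count can be evaluated with and without the factor $\gamma$. The sketch below uses the rounded values of the example ($c = 3 \times 10^8$ m/s, $\tau_0 = 2.3 \times 10^{-6}$ s); the function name `surviving_count` is an illustrative choice.

```python
import math

C = 3.0e8            # speed of light in m/s, rounded as in the example
TAU_PROPER = 2.3e-6  # mean life in the muon rest frame (value used in the text), s

def surviving_count(n_start, height_m, beta):
    """Count expected after a vertical descent of height_m at speed beta*c."""
    gamma = 1 / math.sqrt(1 - beta**2)
    tau_earth = gamma * TAU_PROPER          # dilated mean life in the Earth frame
    flight_time = height_m / (beta * C)     # travel time in the Earth frame
    return n_start * math.exp(-flight_time / tau_earth)

n_sea = surviving_count(10, 2000, 0.995)
# Same flight, but (incorrectly) using the undilated mean life.
n_no_dilation = 10 * math.exp(-(2000 / (0.995 * C)) / TAU_PROPER)

print(f"with time dilation:    {n_sea:.2f} per minute")
print(f"without time dilation: {n_no_dilation:.2f} per minute")
```

With the unrounded $\gamma$ the first figure comes out marginally above the 7.47 obtained with $\gamma \approx 10$, while without time dilation fewer than one muon per minute would be expected at sea level.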


Example 3.9. Two events “A” and “B”, when viewed from S, are separated by a distance d = 2 lt-sec and a time interval t as given below:

(a) t = 6 sec;
(b) t = 2 sec;
(c) t = 1 sec.

(i) Is there a frame of reference S', for each one of the above three cases, in which the events occur simultaneously? If so, determine the boost velocity $c\beta$ of S' with respect to S.

(ii) Is there a frame of reference S', for each one of the above three cases, in which the events occur at the same spatial location? If so, determine the boost velocity $c\beta$ of S' with respect to S and the proper time between the events.


Solution. For this and subsequent solutions to problems in this chapter, we shall write $\Delta\vec{r}$ to mean the coordinate difference between two points B and A. Actually, this quantity is a 4-vector, stretching from the event A to the event B. The meaning and significance will be explained in Sec. 7.8.

(a) Let the coordinate difference between the two events be $\Delta\vec{r} = B - A$. Then, in the format $(ct, x)$,

$$\Delta\vec{r} = (6c, 2c), \qquad \Delta s^2 = (6c)^2 - (2c)^2 = (4\sqrt{2}\,c)^2, \qquad \Delta\tau = \text{proper time} = 4\sqrt{2}\ \text{sec}.$$

The interval is time-like. Hence, we have the following conclusions:
(i) The events cannot occur simultaneously in any frame.
(ii) Yes, there exists a frame S' in which the events occur at the same spatial location. Let $\beta c$ be the boost velocity. Then,

$$\Delta x' = 0 = \gamma\,(\Delta x - \beta\,\Delta ct) \;\Rightarrow\; \beta = \frac{\Delta x}{\Delta ct} = \frac{2c}{6c} = \frac{1}{3}, \qquad \gamma = \frac{1}{\sqrt{1 - 1/9}} = \frac{3}{2\sqrt{2}}.$$

There is a proper time $\Delta\tau$ between the events. Let us find it:

$$c\,\Delta\tau = \gamma\,(\Delta ct - \beta\,\Delta x) = \frac{3}{2\sqrt{2}}\left(6 - \frac{1}{3}\times 2\right)c = 4\sqrt{2}\,c\ \text{m} = 4\sqrt{2}\ \text{lt-sec}.$$

Hence, $\Delta\tau = 4\sqrt{2}$ sec, and

$$\frac{\Delta\tau}{\Delta t} = \frac{4\sqrt{2}}{6} = \frac{2\sqrt{2}}{3} = \frac{1}{\gamma},$$

consistent with the time-dilation formula (2.23).

(b) $\Delta\vec{r} = (2c, 2c)$; $\Delta s^2 = 0$. The interval is light-like.
(i) No. (ii) No.

(c) $\Delta\vec{r} = (1c, 2c)$; $\Delta s^2 = -3c^2$. The interval is space-like.
(i) Yes, there exists a frame S' in which the events occur simultaneously. Let us find the boost velocity for the frame S'.

$$c\,\Delta t' = 0 = \gamma\,(\Delta ct - \beta\,\Delta x) \;\Rightarrow\; \beta = \frac{\Delta ct}{\Delta x} = \frac{c}{2c} = \frac{1}{2}, \qquad \gamma = \frac{1}{\sqrt{1 - 1/4}} = \frac{2}{\sqrt{3}}.$$

In this frame $\Delta x' = \gamma\,(\Delta x - \beta\,\Delta ct) = \sqrt{3}\,c$. This is consistent with the length contraction formula (2.26):

$$\frac{\Delta x}{\Delta x'} = \frac{2}{\sqrt{3}} = \gamma.$$

(ii) No.
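The three cases follow one pattern, which can be condensed into a small routine. This is a sketch in units where lengths are in lt-sec and times in sec (so $c = 1$); the function name `analyse` and its return convention are illustrative choices.

```python
import math

def analyse(ct, x):
    """Classify the interval (ct, x) and return (kind, beta, invariant).

    Time-like : beta boosts to the frame where the events share a location;
                invariant is the proper time between them.
    Space-like: beta boosts to the frame where the events are simultaneous;
                invariant is their spatial separation in that frame.
    Light-like: no such frame exists, so beta is None.
    """
    s2 = ct**2 - x**2
    if s2 > 0:
        return "time-like", x / ct, math.sqrt(s2)
    if s2 < 0:
        return "space-like", ct / x, math.sqrt(-s2)
    return "light-like", None, 0.0

for ct, x in [(6, 2), (2, 2), (1, 2)]:
    print((ct, x), analyse(ct, x))
```

For the time-like case the routine returns $\beta = 1/3$ and a proper time of $4\sqrt{2}$ sec, and for the space-like case $\beta = 1/2$ and a separation of $\sqrt{3}$ lt-sec, in agreement with the solution above.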
Example 3.10. Two lightnings strike, leaving marks A and B on a straight 
railway track simultaneously according to the Station Master (Mr. S). A 
traveller (Mr T) travelling in a long Einsteinian railway train moving with 
velocity 0.6c records the timings of the events using his instruments. Let the 
distance between A and B be $L = 2.4 \times 10^{3}$ km, as measured by Mr S.
(a) Did the events occur simultaneously according to the measurements of 
Mr T? If not, which occurred earlier? 
(b) What is the time interval between these events according to Mr T? 
(c) What is the length of the track segment AB according to Mr T? 
(d) The lightnings also left marks on the wheels A’, B’ of the train. What is 
the length L' of the segment A’B’ (i) according to Mr T? (ii) according to 
Mr S? 


Solution.

(a) No.

(b) $L = 2.4 \times 10^{3}$ km $= 0.8 \times 10^{-2}$ lt-sec $= \tfrac{4}{5}\,c \times 10^{-2}$ m. $\beta = 0.6 = 3/5$;

$$\gamma = \frac{1}{\sqrt{1 - 9/25}} = \frac{5}{4}.$$

In the format $(ct, x)$, the two events are:

$$\text{In S:}\quad A = (0, 0), \qquad B = \left(0,\ \tfrac{4}{5}\,c \times 10^{-2}\right). \tag{3.52}$$

$$\text{In T:}\quad ct'_A = 0, \qquad ct'_B = \gamma\left(0 - \tfrac{3}{5} \times \tfrac{4}{5}\,c \times 10^{-2}\right) = -\tfrac{3}{5}\,c \times 10^{-2},$$

$$t'_B - t'_A = -\tfrac{3}{5} \times 10^{-2}\ \text{sec}.$$

The time interval $t'_B - t'_A$ is negative. Therefore, B occurs before A.

(c) The ground segment AB is at rest with respect to Mr S. He measures the proper length $L$ for this segment. In contrast, Mr T sees the ground segment AB to be moving to the left. Therefore, he measures the improper length $L'$ for the same segment.

By Eq. (2.26): $L = \gamma L'$. Hence $L' = L/\gamma = 2.4 \times 10^{3} \times \tfrac{4}{5} = 1.92 \times 10^{3}$ km.

(d) The train segment A'B' is at rest with respect to Mr T. He measures the proper length $\mathcal{L}$ for this segment. In contrast, Mr S sees the train segment A'B' to be moving to the right. Therefore he measures the improper length $L$ for the same segment.

By Eq. (2.26): $\mathcal{L} = \gamma L = \tfrac{5}{4} \times 2.4 \times 10^{3} = 3 \times 10^{3}$ km.

Note that $\mathcal{L}/L = L/L'$, as per Theorem 3.1.
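The numbers of this example can be reproduced by transforming the two lightning events directly. The sketch below assumes $L = 2.4 \times 10^{3}$ km (the value consistent with the $0.8 \times 10^{-2}$ lt-sec used in the solution) and the rounded $c = 3 \times 10^{5}$ km/s; the helper `lorentz` is an illustrative name.

```python
import math

C_KM_S = 3.0e5  # speed of light in km/s, rounded as in the example

def lorentz(ct, x, beta):
    """Transform an event (ct, x) from S to a frame moving with velocity beta*c."""
    gamma = 1 / math.sqrt(1 - beta**2)
    return gamma * (ct - beta * x), gamma * (x - beta * ct)

beta = 0.6
L_km = 2.4e3                              # assumed track length in km
ct_A, x_A = lorentz(0.0, 0.0, beta)       # lightning at A
ct_B, x_B = lorentz(0.0, L_km, beta)      # lightning at B, simultaneous in S

dt_train = (ct_B - ct_A) / C_KM_S         # seconds; negative means B is earlier
train_segment = x_B - x_A                 # km; separation of the marks A'B' on the train
gamma = 1 / math.sqrt(1 - beta**2)

print(f"t'_B - t'_A             = {dt_train:.3e} s")
print(f"A'B' in the train frame = {train_segment:.1f} km")
print(f"AB in the train frame   = {L_km / gamma:.1f} km")
```

The marks on the train are farther apart than $L$ because, for Mr T, the strike at B happens first and the train keeps moving relative to the track until the strike at A.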


Example 3.11. An event “A” = (0, 0, 0, 0) is followed by a second event “B” = (1, 2, 0, 0), 1 sec later, the coordinates being measured with respect to a frame S in lt-sec. Is there a frame of reference S' in which

(a) “A” and “B” are simultaneous?

(b) the event “B” is followed by the event “A”, 1 sec later, i.e. $t'_A - t'_B = 1$?

(c) the events “A” and “B” occur at the same spatial location?


Find the boost velocity $c\beta$ of S' relative to S in each of the above cases (if such a frame S' exists).


Solution. $\Delta\vec{r} = (1, 2)$ in S. Hence, $\Delta s^2 = 1 - 4 = -3$. The interval is space-like.

(a) Yes. In the second frame $\Delta t' = 0$. Hence, $0 = \gamma\,(1 - \beta \times 2) \Rightarrow \beta = \tfrac{1}{2}$.

(b) Yes. To find this frame, note that $c\,\Delta t' = -1$. Hence,

$$-1 = \gamma\,(1 - \beta \times 2), \quad\text{or}\quad -\frac{1}{\gamma} = 1 - 2\beta.$$

Squaring both sides gives $1 - \beta^2 = (1 - 2\beta)^2$, i.e. $5\beta^2 = 4\beta$. The non-zero root is $\beta = \tfrac{4}{5}$, for which $\gamma = \tfrac{5}{3}$ and $\gamma\,(1 - 2\beta) = -1$, as required.

(c) No, because the interval is space-like.


Fig. 3.11. Rod falling vertically.


Example 3.12. Consider a straight horizontal rod moving vertically 
downward with uniform speed V as seen from frame S' (see Fig. 3.11). Its 
ends A and B will therefore strike the horizontal floor (represented by the 
X-axis) simultaneously according to S'. However, when seen from S, the 
ends strike the floor at different times, and, therefore, the bar is not 
horizontal. Find the angle that the bar makes with the horizontal (measure 
positive angle anticlockwise from the X-axis.) 


Solution. To find the angle $\phi$, we need to know the coordinates of the ends A and B at the same instant of time (i.e. simultaneously) in S. We know that the two events $\Theta_{AP}$ = “A touches P” and $\Theta_{BQ}$ = “B touches Q” are simultaneous in S'. Their coordinates, in the format $(ct, x)$, in S' and S are as follows:

$$\Theta_{AP} = \begin{cases} (0, 0) & \text{in } S', \\ (0, 0) & \text{in } S, \end{cases} \qquad \Theta_{BQ} = \begin{cases} (0, \ell) & \text{in } S', \\ (ct_{BQ}, x_{BQ}) & \text{in } S. \end{cases}$$

Using the inverse Lorentz transformation equation (3.9), we get the values of $(ct_{BQ}, x_{BQ})$ in S:

$$ct_{BQ} = \gamma\,(0 + \beta\ell) = \gamma\beta\ell, \qquad x_{BQ} = \gamma\,(\ell + 0) = \gamma\ell.$$

Let us look at the rod from the frame S. At $ct = 0$, the left end A of the rod is at the origin, i.e. at the spatial coordinates (0, 0). Let the right end B of the rod at the same instant be at the spatial coordinates $(x_0, y_0)$. At $ct_{BQ}$, the right end of the rod is touching the floor. Therefore, in the time $\{ct_{BQ} = \gamma\beta\ell\}$, the end B has undergone a spatial displacement from $(x_0, y_0)$ to $(\gamma\ell, 0)$.

Seen from S, all points of the rod have a velocity $\mathbf{v}$. The horizontal and the vertical components of this velocity can be found from the velocity addition formula (4.23): $\nu_x = \beta$, $\nu_y = \nu'_y/\gamma = -V/(\gamma c)$. We have set $\boldsymbol{\nu}$ as the dimensionless velocity, like $\beta$: $\mathbf{v} = c\boldsymbol{\nu}$.

Therefore,

$$x_0 + \left(\frac{\gamma\beta\ell}{c}\right)(\beta c) = \gamma\ell \;\Rightarrow\; x_0 = \gamma\ell\,(1 - \beta^2) = \frac{\ell}{\gamma},$$

$$y_0 + \left(\frac{\gamma\beta\ell}{c}\right)\left(-\frac{V}{\gamma}\right) = 0 \;\Rightarrow\; y_0 = \frac{\beta\ell V}{c}.$$

Therefore, $\tan\phi = \dfrac{y_0}{x_0} = \dfrac{\beta\gamma V}{c}$.

Example 3.13. Distant galaxies are known to be receding from ours at very 
large speeds, as evidenced by the shift in the frequencies of the spectral 
lines from atoms towards lower values, a phenomenon known as red shift. 


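A distant galaxy is recognized to be receding at a speed of 0.5c along the line of sight, from the measurement of the red shift of the sodium D$_2$ line of wavelength 5890 Å. Find the measured value of the wavelength.

Solution. Use the Doppler shift formula (3.31). Note that the ratio of the frequencies is the inverse of the ratio of the wavelengths, and that $\beta = \tfrac{1}{2}$; $\gamma = \tfrac{2}{\sqrt{3}}$; $\theta = 0°$. Therefore,

$$\lambda = \gamma\,(1 + \beta\cos 0°)\,\lambda_0 = \sqrt{3} \times 5890 \approx 10{,}202\ \text{Å}.$$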
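The same relation, $\lambda = \gamma\,(1 + \beta\cos\theta)\,\lambda_0$, is easy to evaluate for other recession speeds. The following is a minimal sketch; the function name and the convention that $\theta = 0°$ means recession along the line of sight are taken from the way the formula is used in this example.

```python
import math

def observed_wavelength(rest_wavelength, beta, theta_deg=0.0):
    """Wavelength received from a source receding at beta*c.

    theta_deg is the angle between the source velocity and the line of
    sight; 0 degrees means direct recession.
    """
    gamma = 1 / math.sqrt(1 - beta**2)
    return gamma * (1 + beta * math.cos(math.radians(theta_deg))) * rest_wavelength

lam = observed_wavelength(5890.0, 0.5)
print(f"beta = 0.5: {lam:.1f} angstrom")
for beta in (0.01, 0.1, 0.9):
    print(f"beta = {beta}: {observed_wavelength(5890.0, beta):.1f} angstrom")
```

For direct recession the factor $\gamma(1+\beta)$ equals $\sqrt{(1+\beta)/(1-\beta)}$, which is $\sqrt{3}$ for $\beta = 0.5$.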


aSee Ref. [34, Sec. 2.8]. 


Chapter 4 


Relativistic Mechanics 


The previous chapter elucidated the relativity principle by first enunciating 
the basic postulates and then showing their immediate consequences. One 
of these consequences, namely the Lorentz transformation, has played a 
crucial role in reshaping classical mechanics by redefining the fundamental 
quantities, like energy and momentum, and then modifying Newton’s 
equations of motion. We propose to sketch in this chapter some of these 
revolutionary developments and structure relativistic mechanics as an 
outgrowth of Lorentz transformation. Central to this development is the 
velocity addition formula, which forms the topic of the next section. 

We shall adopt the following convention. The boost velocity, i.e. the velocity of S' with respect to S, will be denoted by $\mathbf{u} = c\boldsymbol{\beta}$, and the velocity of a moving particle will be denoted by $\mathbf{v} = c\boldsymbol{\nu}$ in S and $\mathbf{v}' = c\boldsymbol{\nu}'$ in S'.


4.1. Relativistic Form of Velocity Transformation 


The relativistic velocity transformation formula, generally known by its 
popular name Velocity Addition Formula, is a relativistic generalization of 
the Galilean velocity transformation formula shown in Eq. (2.8c). We shall 
first obtain the simplest special case by considering the boost: S($c\beta$, 0, 0)S',
corresponding to the standard configuration shown in Fig. 3.1. In Fig. 4.1, 
we have shown a moving particle, and its radius vectors r and r’ and its 
velocities v and v’, both with respect to the frames S and S'. 

Let us consider a particle moving along a certain trajectory, which appears as one curve to S and as another to S' (Figs. 4.1(b) and 4.1(c)). Consider two points A and B on the trajectory of the particle, lying infinitely close to each other. The radius vectors of these points are $\mathbf{r}$ and $\mathbf{r} + d\mathbf{r}$ in S, and $\mathbf{r}'$ and $\mathbf{r}' + d\mathbf{r}'$ in S'. These points are reached at times $t$ and $t + dt$ in S, and $t'$ and $t' + dt'$ in S'. We can, therefore, say that “the particle arrives at A” and “the particle arrives at B” are two events, which can be designated by the symbols “$\Theta_A$” and “$\Theta_B$”, respectively, and whose coordinates are as follows:


Fig. 4.1. Trajectory of a particle in the frames S and S’. 


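$$\Theta_A = \begin{cases} (ct, \mathbf{r}) = (ct, x, y, z) & \text{in } S, \\ (ct', \mathbf{r}') = (ct', x', y', z') & \text{in } S', \end{cases} \tag{4.1}$$

$$\Theta_B = \begin{cases} (ct + c\,dt,\ \mathbf{r} + d\mathbf{r}) = (ct + c\,dt,\ x + dx,\ y + dy,\ z + dz) & \text{in } S, \\ (ct' + c\,dt',\ \mathbf{r}' + d\mathbf{r}') = (ct' + c\,dt',\ x' + dx',\ y' + dy',\ z' + dz') & \text{in } S'. \end{cases} \tag{4.2}$$

The coordinate differentials transform according to the Lorentz transformation, as pointed out in Eq. (3.22). Therefore,

\begin{align}
c\,dt' &= \gamma\,(c\,dt - \beta\,dx), \tag{4.3a}\\
dx' &= \gamma\,(dx - \beta c\,dt), \tag{4.3b}\\
dy' &= dy, \tag{4.3c}\\
dz' &= dz. \tag{4.3d}
\end{align}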


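Let the velocity of the particle at the event “$\Theta_A$” be $\mathbf{v} = \frac{d\mathbf{r}}{dt}$ as it appears to S, and $\mathbf{v}' = \frac{d\mathbf{r}'}{dt'}$ as it appears to S'. The Cartesian components of these vectors are as follows:

$$\mathbf{v} = (v_x, v_y, v_z) = \left(\frac{dx}{dt}, \frac{dy}{dt}, \frac{dz}{dt}\right), \qquad \mathbf{v}' = (v'_x, v'_y, v'_z) = \left(\frac{dx'}{dt'}, \frac{dy'}{dt'}, \frac{dz'}{dt'}\right). \tag{4.4}$$

Note that, in order to determine the velocity components in a given frame of reference, we have used the length and time intervals pertaining to that frame of reference only. We can now utilize Eq. (4.3) to connect the two sets of velocity components as follows:

$$v'_x = \frac{dx'}{dt'} = \frac{c\,\gamma\,(dx - \beta c\,dt)}{\gamma\,(c\,dt - \beta\,dx)} = \frac{\frac{dx}{dt} - \beta c}{1 - \frac{\beta}{c}\frac{dx}{dt}} = \frac{v_x - u}{1 - u v_x/c^2}, \tag{4.5}$$

where $u = c\beta$. In the same manner, we can obtain the transformation equations for the other components. Writing them together, we get the velocity transformation for this special case, namely, boost: S($u$, 0, 0)S', as

$$v'_x = \frac{v_x - u}{1 - u v_x/c^2}, \qquad v'_y = \frac{v_y}{\gamma\,(1 - u v_x/c^2)}, \qquad v'_z = \frac{v_z}{\gamma\,(1 - u v_x/c^2)}. \tag{4.6}$$

Alternatively, if we had started from the inverse transformation equations (3.9), we would have obtained the following result, which is the inverse of Eq. (4.6):

$$v_x = \frac{v'_x + u}{1 + u v'_x/c^2}, \qquad v_y = \frac{v'_y}{\gamma\,(1 + u v'_x/c^2)}, \qquad v_z = \frac{v'_z}{\gamma\,(1 + u v'_x/c^2)}. \tag{4.7}$$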


Equation (4.7) effectively shows the result of adding velocity u to v, 
whereas Eq. (4.6) tells us the result of subtracting u from v. These results 
are distinctively different from their Galilean counterparts. These equations 
are often referred to as the velocity addition formulas (subtraction of u is 
same as addition of —u). 


We now have to verify that the transformation equations (4.6) and (4.7) satisfy two essential requirements of all relativistic results, namely that (i) they must reduce to the corresponding non-relativistic result in the limit of small velocities; and that (ii) if a particle is moving with speed $c$ in the frame S', then it should move with the same speed $c$ in the second frame S, and vice versa (so that light propagates with the same speed $c$ in both frames). With regard to requirement (i), let us record here the non-relativistic limit criterion:

$$\text{When } v \ll c:\quad \beta \to 0, \quad \gamma \to 1 \qquad \text{(N.R. limit)}. \tag{4.8}$$

Applying the criterion (4.8) to Eq. (4.6), so that $u v_x/c^2 \to 0$, one gets

$$v'_x = v_x - u, \qquad v'_y = v_y, \qquad v'_z = v_z. \tag{4.9}$$

This is the Galilean result (2.5).

To meet the requirement (ii), we set $\mathbf{v}' = (c, 0, 0)$ in Eq. (4.7) and get

$$v_x = \frac{c + u}{1 + uc/c^2} = c, \qquad v_y = \frac{0}{\gamma\,(1 + uc/c^2)} = 0, \qquad v_z = 0, \tag{4.10}$$

so that $\mathbf{v} = (c, 0, 0)$.
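Both requirements can also be checked numerically. The sketch below assumes the standard form of the velocity transformation for a boost of speed $u$ along the X-axis (the content of Eqs. (4.6) and (4.7)) and works in units with $c = 1$; the function names are illustrative.

```python
import math

def to_primed(v, u, c=1.0):
    """Eq. (4.6): velocity v = (vx, vy, vz) in S -> velocity in S' (boost u along X)."""
    gamma = 1 / math.sqrt(1 - (u / c) ** 2)
    d = 1 - u * v[0] / c**2
    return ((v[0] - u) / d, v[1] / (gamma * d), v[2] / (gamma * d))

def to_unprimed(v_p, u, c=1.0):
    """Eq. (4.7): the inverse transformation, obtained by replacing u with -u."""
    return to_primed(v_p, -u, c)

u = 0.6
v = (0.3, 0.4, 0.1)
back = to_unprimed(to_primed(v, u), u)      # should reproduce v
print("round trip:", back)

light = to_unprimed((1.0, 0.0, 0.0), u)     # light moving along X' in S'
print("light in S:", light)

slow = to_primed((3e-5, 2e-5, 0.0), 1e-5)   # all speeds much smaller than c
print("slow limit:", slow)                  # close to the Galilean (2e-5, 2e-5, 0)
```

The same check works for light moving in an oblique direction in S': its direction changes under the boost, but its speed in S remains $c$.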

We shall now illustrate the velocity addition formula by showing that 
the speed of light c is the ultimate speed, i.e. no material particle can be 
accelerated beyond the speed of light. Suppose a particle is moving along 
the X-axis with speed 0.99c relative to the laboratory frame S. Imagine a 
second frame S' which is also moving along the X-axis with velocity 0.99c 
relative to the Lab frame so that the particle is at rest in S'. The particle is 
now accelerated (from the present zero velocity in S') to velocity 0.99c 
along the X'-axis in this second frame S'. According to the velocity addition 
formula (4.7), therefore, the resulting velocity of the particle in the 
laboratory frame S would be 


$$v = \frac{0.99c + 0.99c}{1 + 0.99 \times 0.99} = 0.9999494c, \quad \text{along the X-axis}.$$

Therefore, our velocity addition formula effectively tells us that if we add velocity 0.99c to 0.99c, we get 0.9999494c, instead of 1.98c. However hard we may try to increase the speed of a particle which has nearly reached the speed of light, it will never be able to reach, or surpass, the speed of light.
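The argument can be repeated as often as one likes: each time, the particle is boosted by a further 0.99c relative to its current rest frame. A minimal sketch, with speeds in units of $c$ and an illustrative function name:

```python
def add_velocities(beta1, beta2):
    """Collinear relativistic velocity addition, speeds in units of c."""
    return (beta1 + beta2) / (1 + beta1 * beta2)

beta = 0.0
for step in range(1, 6):
    beta = add_velocities(beta, 0.99)   # boost by 0.99c in the current rest frame
    print(step, f"{beta:.12f}")
```

The printed speeds creep towards 1 but stay below it: the shortfall $1 - \beta$ shrinks by roughly a factor of 200 at each step, yet never vanishes.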

We can make the argument more convincing by analyzing the motion of 
a particle which is under a “constant acceleration” a in its instantaneous rest 
frame (to be abbreviated as IRF) along the X-axis. The term instantaneous 
rest frame — also called the comoving Lorentz frame — means an inertial 
frame in which the particle is, at least momentarily, at rest. We shall explain 
the concept of IRF with the diagram shown in Fig. 4.2. 


Fig. 4.2. Explaining Instantaneous Rest Frame, shown as S'. 


Imagine an accelerating particle, e.g. a rocket R. Its own frame of reference (i.e. the frame in which it is always at rest) is an accelerating frame. At a certain event, say “$\Theta$: rocket passes a space station A”, the rocket releases an iron cage with X'Y'Z'-axes welded on it. We label this frame as S'. This frame S' is not in acceleration. Hence it is an inertial frame (IF) whose velocity is the same as that of R at the event “$\Theta$”. Then S' is an IRF of R at the event “$\Theta$”. We can establish a Lorentz connection between S and S' through the boost {S($\beta$, 0, 0)S'}, where $\beta = v/c$. The following paragraphs will amplify the concept of IRF.

In Fig. 4.3(a), we have shown the rectilinear trajectory of a particle accelerating along the X-axis, as seen from the laboratory frame S. The particle passes space stations A and B with speeds $v$ and $v + dv$ at times $t$ and $t + dt$. Call these events “$\Theta_A$” and “$\Theta_B$”, respectively.

Figure 4.3(b) shows the IRF S' of the particle at the event “$\Theta_A$”. In this frame, the events “$\Theta_A$” and “$\Theta_B$” take place at the times $t'$ and $t' + dt'$, the particle’s speed at these events being 0 and $dv'$, respectively, so that

$$dv' = a\,dt'. \tag{4.11}$$


According to the velocity addition formula (4.7), therefore,

$$v + dv = \frac{dv' + v}{1 + v\,dv'/c^2} \approx (dv' + v)\left(1 - \frac{v\,dv'}{c^2}\right) \approx v + \left(1 - \frac{v^2}{c^2}\right)dv',$$

so that

$$dv = \left(1 - \frac{v^2}{c^2}\right) a\,dt'. \tag{4.12}$$

Fig. 4.3. Lab frame and comoving Lorentz frame. (a) View from the Lab frame. (b) View from the comoving Lorentz frame.

Applying the LT to the coordinate differentials (using the inverse of Eqs. (4.3)), we get

$$c\,dt = \frac{1}{\sqrt{1 - v^2/c^2}}\left(c\,dt' + \frac{v}{c}\,dx'\right). \tag{4.13}$$


Now, the particle, which was momentarily at rest at $t'$, has covered the distance $dx'$ under a constant acceleration $a$ in the subsequent time interval $dt'$. Therefore, $dx' = \tfrac{1}{2}a\,(dt')^2 \approx 0$, so that, from (4.13),

$$dt = \frac{dt'}{\sqrt{1 - v^2/c^2}}. \tag{4.14}$$

We shall introduce a new Lorentz factor, namely one associated with the velocity of a particle, and call it the dynamic Lorentz factor, to be symbolized by capital gamma:

$$\Gamma \equiv \frac{1}{\sqrt{1 - v^2/c^2}} = \frac{1}{\sqrt{1 - \beta^2}}, \qquad \beta \equiv \frac{v}{c}. \tag{4.15}$$

Note that the time interval $dt'$ is measured in the IRF. Hence it is the same as the proper time $d\tau$ between the events “$\Theta_A$” and “$\Theta_B$”. By virtue of (4.15), Eq. (4.14) can be written as

$$dt = \Gamma\,dt' = \Gamma\,d\tau, \tag{4.16}$$


which is the same as the relation between proper time and improper time 
given in Eq. (2.23). Equation (4.16) will find many applications in the 
discussions to follow. 

Compare the upper case $\Gamma$ of (4.15) with the lower case gamma $\gamma$ defined in Eq. (2.20) to represent the Lorentz factor associated with the boost velocity, which we had called the boost Lorentz factor.

We can now rewrite Eq. (4.12) as

\begin{align}
dv &= \left(1 - \frac{v^2}{c^2}\right) a\,d\tau \tag{4.17a}\\
&= \left(1 - \frac{v^2}{c^2}\right)^{3/2} a\,dt = \frac{a}{\Gamma^3}\,dt. \tag{4.17b}
\end{align}


It is seen from Eq. (4.17b) that the acceleration of the particle, as seen from the laboratory frame, is constantly reducing, and is given as follows:

$$a_{\text{Lab}} = \frac{dv}{dt} = \left(1 - \frac{v^2}{c^2}\right)^{3/2} a = \frac{a}{\Gamma^3}. \tag{4.18}$$

Note that as the particle reaches relativistic speed, finally approaching $c$, $\Gamma \to \infty$ and $a_{\text{Lab}} \to 0$. Even at a constant acceleration in its IRF, the particle hardly accelerates at all in the Lab frame as it approaches the speed of light. The formula (4.18) therefore safeguards one of the principles of relativity, namely $v \le c$.

If we rewrite Eq. (4.17) as

$$\frac{dv}{\left(1 - v^2/c^2\right)^{3/2}} = a\,dt \tag{4.19}$$

and integrate, subject to the initial condition $v = 0$ at $t = 0$, we get the expression for the velocity in the Lab frame S:

$$\frac{v}{\sqrt{1 - v^2/c^2}} = at, \tag{4.20}$$

or, solving for $v$,

$$v = \frac{at}{\sqrt{1 + (at/c)^2}}. \tag{4.21}$$


The result shows that a particle under “constant acceleration in its 
instantaneous rest frame” will asymptotically approach the speed c in any 
IF, say S, but will never reach it. See Fig. 4.5 below. 
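The asymptotic approach to $c$ can be made concrete by comparing the closed form $v = at/\sqrt{1 + (at/c)^2}$, which follows from integrating Eq. (4.19), with a direct numerical integration of Eq. (4.17b). This is a sketch: the value $a = 9.8$ m/s$^2$, the step count and the function names are illustrative assumptions, not taken from the text.

```python
import math

C = 299_792_458.0   # speed of light, m/s

def lab_velocity(a, t):
    """Closed-form Lab-frame speed at Lab time t, starting from rest."""
    return a * t / math.sqrt(1 + (a * t / C) ** 2)

def integrate_numerically(a, t_end, steps=100_000):
    """Euler integration of dv/dt = a * (1 - v^2/c^2)^(3/2), Eq. (4.17b)."""
    dt = t_end / steps
    v = 0.0
    for _ in range(steps):
        v += a * (1 - (v / C) ** 2) ** 1.5 * dt
    return v

a = 9.8            # m/s^2 in the instantaneous rest frame (illustrative)
year = 3.156e7     # seconds
for n in (0.5, 1, 2, 5):
    t = n * year
    print(f"{n:>4} yr   exact v/c = {lab_velocity(a, t) / C:.6f}"
          f"   numerical v/c = {integrate_numerically(a, t) / C:.6f}")
```

A Newtonian calculation would give $v = at$, which exceeds $c$ after a little less than one year at this acceleration; the relativistic speed stays below $c$ at all times.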

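The presence of $c^2$ in the denominators of Eqs. (4.6) and (4.7) makes them appear somewhat clumsy. To make them look neater, we shall rewrite them in terms of dimensionless velocities $\boldsymbol{\beta}$, $\boldsymbol{\nu}$, defined as

$$\mathbf{u} = c\boldsymbol{\beta} = c\,(\beta, 0, 0), \qquad \mathbf{v} = c\boldsymbol{\nu} = c\,(\nu_x, \nu_y, \nu_z). \tag{4.22}$$

With these modifications, Eqs. (4.6) and (4.7) take the following dimensionless forms:

$$\nu'_x = \frac{\nu_x - \beta}{1 - \nu_x\beta}, \qquad \nu'_y = \frac{\nu_y}{\gamma\,(1 - \nu_x\beta)}, \qquad \nu'_z = \frac{\nu_z}{\gamma\,(1 - \nu_x\beta)}, \tag{4.23a}$$

$$\nu_x = \frac{\nu'_x + \beta}{1 + \nu'_x\beta}, \qquad \nu_y = \frac{\nu'_y}{\gamma\,(1 + \nu'_x\beta)}, \qquad \nu_z = \frac{\nu'_z}{\gamma\,(1 + \nu'_x\beta)}. \tag{4.23b}$$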

4.2. Relativistic Form of Acceleration Transformation 


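Let

\begin{align}
\mathbf{a} &= \frac{d\mathbf{v}}{dt} = c\,\frac{d\boldsymbol{\nu}}{dt} = c\,\dot{\boldsymbol{\nu}} = c\,(\dot{\nu}_x, \dot{\nu}_y, \dot{\nu}_z), \notag\\
\mathbf{a}' &= \frac{d\mathbf{v}'}{dt'} = c\,\frac{d\boldsymbol{\nu}'}{dt'} = c\,\dot{\boldsymbol{\nu}}' = c\,(\dot{\nu}'_x, \dot{\nu}'_y, \dot{\nu}'_z) \tag{4.24}
\end{align}

represent the acceleration vector and its Cartesian components in S and S', respectively. Note that the “dot” means $\frac{d}{dt}$ in the first line and $\frac{d}{dt'}$ in the second line. Then

$$\dot{\nu}'_i = c\,\frac{d\nu'_i}{c\,dt'} = c\,\frac{d\nu'_i/dt}{c\,dt'/dt}, \qquad i = x, y, z. \tag{4.25}$$

Differentiating (4.23a) with respect to $t$,

\begin{align}
\frac{d\nu'_x}{dt} &= \frac{\dot{\nu}_x\,(1 - \beta^2)}{(1 - \nu_x\beta)^2} = \frac{\dot{\nu}_x}{\gamma^2\,(1 - \nu_x\beta)^2}, \notag\\
\frac{d\nu'_y}{dt} &= \frac{\dot{\nu}_y\,(1 - \nu_x\beta) + \nu_y\dot{\nu}_x\beta}{\gamma\,(1 - \nu_x\beta)^2}. \tag{4.26}
\end{align}

Using (4.3a), $\dfrac{c\,dt'}{dt} = \gamma c\,(1 - \beta\nu_x)$.

We wrap up (4.25) and (4.26) to obtain the following acceleration transformation formulas:

\begin{align}
\dot{\nu}'_x &= \frac{\dot{\nu}_x}{\gamma^3\,(1 - \nu_x\beta)^3}, \tag{4.27a}\\
\dot{\nu}'_y &= \frac{\dot{\nu}_y\,(1 - \nu_x\beta) + \nu_y\dot{\nu}_x\beta}{\gamma^2\,(1 - \nu_x\beta)^3}, \tag{4.27b}\\
\dot{\nu}'_z &= \frac{\dot{\nu}_z\,(1 - \nu_x\beta) + \nu_z\dot{\nu}_x\beta}{\gamma^2\,(1 - \nu_x\beta)^3}. \tag{4.27c}
\end{align}

Let us specialize (4.27) to rectilinear motion along the X-axis. In this case, only the line (4.27a) is relevant. We drop the subscript “x”, and write:

\begin{align}
\dot{\nu}' &= \frac{\dot{\nu}}{\gamma^3\,(1 - \nu\beta)^3} && \text{for boost: } S(\beta, 0, 0)S', \notag\\
\dot{\nu} &= \frac{\dot{\nu}'}{\gamma^3\,(1 + \nu'\beta)^3} && \text{for boost: } S'(-\beta, 0, 0)S. \tag{4.28}
\end{align}

In the second line, let S' represent the IRF. Then $\nu' = 0$ and $\gamma = \Gamma$. We get

$$\dot{\nu} = \frac{\dot{\nu}'}{\gamma^3} = \frac{\dot{\nu}'}{\Gamma^3}, \tag{4.29}$$

as in (4.18), except that now the acceleration $a' = c\,\dot{\nu}'$ in the IRF is variable.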
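The longitudinal formula can be verified by finite differences: advance the particle by a small time $h$ in S, transform its velocity before and after, and divide by the corresponding $dt'$. The sketch below assumes the forms $\nu' = (\nu - \beta)/(1 - \nu\beta)$ and $dt' = \gamma(1 - \beta\nu)\,dt$ for motion along X; the function name and the sample numbers are illustrative.

```python
import math

def check_longitudinal(nu, nu_dot, beta, h=1e-7):
    """Compare a finite-difference estimate of d(nu')/dt' with nu_dot / (gamma^3 (1 - nu*beta)^3).

    nu      : dimensionless particle velocity along X in S
    nu_dot  : d(nu)/dt in S (units of 1/s)
    beta    : boost velocity of S' in units of c
    """
    gamma = 1 / math.sqrt(1 - beta**2)
    prime = lambda n: (n - beta) / (1 - n * beta)      # velocity of the particle in S'
    d_nu_prime = prime(nu + nu_dot * h) - prime(nu)    # change of nu' during dt = h
    dt_prime = gamma * (1 - beta * nu) * h             # dt' for two events on the worldline
    numeric = d_nu_prime / dt_prime
    formula = nu_dot / (gamma**3 * (1 - nu * beta) ** 3)
    return numeric, formula

print(check_longitudinal(nu=0.5, nu_dot=0.2, beta=0.3))
# S' chosen as the instantaneous rest frame (beta equal to nu):
print(check_longitudinal(nu=0.8, nu_dot=0.1, beta=0.8))
```

In the second call the formula reduces to $\dot{\nu}' = \Gamma^3\dot{\nu}$, i.e. the Lab-frame acceleration is the rest-frame acceleration divided by $\Gamma^3$.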


4.3. A New Definition of Momentum 


One of the consequences of the GT, as we saw in Sec. 2.2.1, is that the 
acceleration a of a particle is the same in all inertial frames. This invariance 
of acceleration is lost under the LT, as shown in Eq. (4.27). An immediate 
consequence of the frame-dependence of acceleration is that if we express 
Newton’s second law of motion as F = ma, and treat the mass m as 
invariant, in line with Newtonian assumption, then force also becomes 
frame dependent. A constant force F in one inertial frame would then 
appear as a variable force in another inertial frame. 

A more serious consequence of the velocity addition formula is that the 
laws of conservation of energy and momentum, which were seen in Sec. 
2.2.2 to be valid in all inertial frames under the GT, now appear to break 
down under the LT. It turns out that if the defining expressions for energy E 
and momentum p are modified to meet the requirement of the ultimate 
speed c, then the frame independence of the conservation laws can be 
restored. 


Fig. 4.4. Symmetric collision 


In the rest of this section, we shall consider a thought experiment to seek a pointer to the definition of linear momentum. With this in mind, let us consider the following example. An observer S watches an elastic collision experiment with two billiard balls A and B, each of mass $m_0$, and assumed to be moving with relativistic speeds (Fig. 4.4(a)). For the sake of simplicity, we assume the collision to be taking place on the XY-plane. The velocities of the balls, before and after the collision, are denoted as $\mathbf{u}_A$, $\mathbf{u}_B$ and $\mathbf{v}_A$, $\mathbf{v}_B$, respectively. We have assumed that the initial speeds of the two balls are equal and that they make equal angles with the X-axis. A straightforward application of the energy-momentum conservation equations (2.10) and (2.11) would lead to the symmetry relations:

$$u_A = u_B = v_A = v_B. \tag{4.30}$$

Of the above four velocities, only $\mathbf{v}_A$ has both X and Y components positive. We shall write these components as $c\nu_x$ and $c\nu_y$, respectively, so that $\boldsymbol{\nu}_A = (\nu_x, \nu_y)$ is the dimensionless velocity of A after the collision. Referring to Fig. 4.4(a), we can now write the following components for the velocities involved:

$$\begin{aligned} \mathbf{u}_A &= c\,(\nu_x, -\nu_y), & \mathbf{v}_A &= c\,(\nu_x, \nu_y), \\ \mathbf{u}_B &= c\,(-\nu_x, \nu_y), & \mathbf{v}_B &= c\,(-\nu_x, -\nu_y). \end{aligned} \tag{4.31}$$


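Now imagine another observer in a frame S' which is moving relative to S with speed $u_{Ax} = c\nu_x$ in the direction of the X-axis (Fig. 4.4(b)). Using the velocity addition formulas (4.23), we obtain the following components for the velocity vectors in S':

\begin{align}
u'_{Ax} &= \frac{u_{Ax} - u_{Ax}}{1 - u_{Ax}^2/c^2} = \frac{\nu_x - \nu_x}{1 - \nu_x^2}\,c = 0, \notag\\
u'_{Ay} &= \frac{u_{Ay}}{\gamma\,(1 - u_{Ax}^2/c^2)} = \frac{-\nu_y}{\gamma\,(1 - \nu_x^2)}\,c, \notag\\
v'_{Ax} &= \frac{v_{Ax} - u_{Ax}}{1 - v_{Ax}u_{Ax}/c^2} = \frac{\nu_x - \nu_x}{1 - \nu_x^2}\,c = 0, \notag\\
v'_{Ay} &= \frac{v_{Ay}}{\gamma\,(1 - v_{Ax}u_{Ax}/c^2)} = \frac{\nu_y}{\gamma\,(1 - \nu_x^2)}\,c, \notag\\
u'_{Bx} &= \frac{u_{Bx} - u_{Ax}}{1 - u_{Bx}u_{Ax}/c^2} = \frac{-2\nu_x}{1 + \nu_x^2}\,c, \notag\\
u'_{By} &= \frac{u_{By}}{\gamma\,(1 - u_{Bx}u_{Ax}/c^2)} = \frac{\nu_y}{\gamma\,(1 + \nu_x^2)}\,c, \notag\\
v'_{Bx} &= \frac{v_{Bx} - u_{Ax}}{1 - v_{Bx}u_{Ax}/c^2} = \frac{-2\nu_x}{1 + \nu_x^2}\,c, \notag\\
v'_{By} &= \frac{v_{By}}{\gamma\,(1 - v_{Bx}u_{Ax}/c^2)} = \frac{-\nu_y}{\gamma\,(1 + \nu_x^2)}\,c, \tag{4.32}
\end{align}

where

$$\gamma = \frac{1}{\sqrt{1 - \nu_x^2}}. \tag{4.33}$$

Hence,

$$\begin{aligned} \mathbf{u}'_A &= c\left(0,\ -\frac{\nu_y}{\gamma\,(1 - \nu_x^2)}\right), & \mathbf{v}'_A &= c\left(0,\ \frac{\nu_y}{\gamma\,(1 - \nu_x^2)}\right), \\ \mathbf{u}'_B &= \frac{c}{1 + \nu_x^2}\left(-2\nu_x,\ \frac{\nu_y}{\gamma}\right), & \mathbf{v}'_B &= \frac{c}{1 + \nu_x^2}\left(-2\nu_x,\ -\frac{\nu_y}{\gamma}\right). \end{aligned} \tag{4.34}$$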

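We shall now put the non-relativistic momentum expression

$$\mathbf{p} = m_0\mathbf{v} \qquad \text{(Not Valid in Relativity)} \tag{4.35}$$

to the momentum conservation test in the frame S' (conservation of momentum and energy in S had been assumed for obtaining Eqs. (4.30) and (4.31)). From Eqs. (4.34), the total momentum of the system in the frame S', before and after the collision, is seen to be as follows:

\begin{align*}
\mathbf{p}'^{(\text{initial})} &= m_0\,(\mathbf{u}'_A + \mathbf{u}'_B) \\
&= m_0 c\left(0 - \frac{2\nu_x}{1 + \nu_x^2},\ -\frac{\nu_y}{\gamma\,(1 - \nu_x^2)} + \frac{\nu_y}{\gamma\,(1 + \nu_x^2)}\right) \\
&= m_0 c\left(-\frac{2\nu_x}{1 + \nu_x^2},\ -\frac{2\nu_x^2\nu_y}{\gamma\,(1 - \nu_x^4)}\right), \\
\mathbf{p}'^{(\text{final})} &= m_0\,(\mathbf{v}'_A + \mathbf{v}'_B) \\
&= m_0 c\left(0 - \frac{2\nu_x}{1 + \nu_x^2},\ \frac{\nu_y}{\gamma\,(1 - \nu_x^2)} - \frac{\nu_y}{\gamma\,(1 + \nu_x^2)}\right) \\
&= m_0 c\left(-\frac{2\nu_x}{1 + \nu_x^2},\ \frac{2\nu_x^2\nu_y}{\gamma\,(1 - \nu_x^4)}\right).
\end{align*}

Thus, $\mathbf{p}'^{(\text{final})} \neq \mathbf{p}'^{(\text{initial})}$, suggesting that the momentum conservation law breaks down if we adopt the definition (4.35). This calls for a revised definition for linear momentum.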
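The failure of $\mathbf{p} = m_0\mathbf{v}$ in S' is easy to exhibit numerically. In the sketch below the values $\nu_x = 0.6$, $\nu_y = 0.3$ are arbitrary illustrative choices, velocities are in units of $c$, and the helper names are hypothetical; the transformation used is the dimensionless form $\nu'_x = (\nu_x - \beta)/(1 - \nu_x\beta)$, $\nu'_y = \nu_y/[\gamma(1 - \nu_x\beta)]$.

```python
import math

def to_primed(v, beta):
    """Velocity transformation in two dimensions; v and the result are in units of c."""
    gamma = 1 / math.sqrt(1 - beta**2)
    d = 1 - v[0] * beta
    return ((v[0] - beta) / d, v[1] / (gamma * d))

nx, ny = 0.6, 0.3                      # illustrative dimensionless components
before = [(nx, -ny), (-nx, ny)]        # u_A, u_B in S
after = [(nx, ny), (-nx, -ny)]         # v_A, v_B in S

def newtonian_total(velocities, beta):
    """Sum of m0*v in S' for two equal masses, in units of m0*c."""
    primed = [to_primed(v, beta) for v in velocities]
    return (sum(p[0] for p in primed), sum(p[1] for p in primed))

p_initial = newtonian_total(before, nx)   # S' moves with the x-velocity of A
p_final = newtonian_total(after, nx)
print("S' initial:", p_initial)
print("S' final:  ", p_final)
```

The X-components agree, but the Y-components come out equal and opposite, $\mp 2\nu_x^2\nu_y/[\gamma(1 - \nu_x^4)]$, so the Newtonian total momentum is not the same before and after the collision in S'.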

One may get a clue along the following lines. The transformation $\mathbf{u} = \left(\frac{dx}{dt}, \frac{dy}{dt}, \frac{dz}{dt}\right) \to \mathbf{u}' = \left(\frac{dx'}{dt'}, \frac{dy'}{dt'}, \frac{dz'}{dt'}\right)$ due to a change of reference frame is a double transformation in the sense that the LT converts both the numerator $dx$ and the denominator $dt$ according to Eq. (4.3). It might have been better to adopt, instead of the coordinate time $dt$, some other measure of the time interval which would remain invariant under the LT. The natural choice would be the proper time interval $d\tau$ of the particle’s own motion.

Referring to the trajectory of the particle shown in Fig. 4.1, we can think of a clock attached to the particle and being carried with it. The timing read on this clock is the proper time of the particle’s own motion (proper time, because the particle’s clock is present all along its own trajectory). This clock reads the timings at the points A and B (i.e. at the events $\Theta_A$ and $\Theta_B$) to be $\tau$ and $\tau + d\tau$, respectively. Therefore, the proper time interval between $\Theta_A$ and $\Theta_B$ is $d\tau$. Dividing the spatial intervals between these events by this proper time, we get a new measure of the velocity, which we shall call proper velocity and write as $\mathbf{v}_{\text{prop}}$. The components of this velocity in the frames S and S' will be

$$\mathbf{v}_{\text{prop}} = \left(\frac{dx}{d\tau}, \frac{dy}{d\tau}, \frac{dz}{d\tau}\right), \qquad \mathbf{v}'_{\text{prop}} = \left(\frac{dx'}{d\tau}, \frac{dy'}{d\tau}, \frac{dz'}{d\tau}\right), \tag{4.36}$$

respectively.

Since a comoving inertial frame of the particle at the event $\Theta_A$ is moving with velocity $\mathbf{v}$ with respect to S, and $\mathbf{v}'$ with respect to S', it follows from Eq. (2.23) that

$$\frac{dt}{d\tau} = \Gamma, \qquad \frac{dt'}{d\tau} = \Gamma', \tag{4.37}$$

where

$$\Gamma = \frac{1}{\sqrt{1 - \nu^2}}, \qquad \Gamma' = \frac{1}{\sqrt{1 - \nu'^2}} \tag{4.38}$$

are the dynamic Lorentz factors in the frames S and S', respectively, as defined in Eq. (4.15). Therefore, from Eqs. (4.36) and (4.37),

$$\mathbf{v}_{\text{prop}} = \Gamma\mathbf{v}, \qquad \mathbf{v}'_{\text{prop}} = \Gamma'\mathbf{v}'. \tag{4.39}$$

Hence we try the following definition for momentum:

$$\mathbf{p} = m_0\mathbf{v}_{\text{prop}} = \Gamma m_0\mathbf{v}. \tag{4.40}$$

For non-relativistic speeds, $\Gamma \to 1$, so that the definition (4.40) converges to (4.35).

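We shall now show that this new definition (4.40) will ensure momentum conservation in the example illustrated through Fig. 4.4. For this we shall first compute all the $\Gamma$-factors, in the S' frame, corresponding to the velocities written in Eq. (4.34). It is easily seen that the $\Gamma$-factors corresponding to $\mathbf{u}'_A$ and $\mathbf{v}'_A$ are equal and therefore can be represented by a common symbol $\Gamma'_A$. Similarly, the $\Gamma$-factors corresponding to $\mathbf{u}'_B$ and $\mathbf{v}'_B$ are equal and can be represented by another symbol $\Gamma'_B$. These factors can be computed with the help of Eqs. (4.34) and (4.33) as follows:

$$\frac{1}{\Gamma'^2_A} = 1 - \frac{u'^2_A}{c^2} = 1 - \frac{\nu_y^2}{\gamma^2\,(1 - \nu_x^2)^2} = 1 - \frac{\nu_y^2}{1 - \nu_x^2} = \frac{1 - \nu_x^2 - \nu_y^2}{1 - \nu_x^2},$$

\begin{align*}
\frac{1}{\Gamma'^2_B} = 1 - \frac{u'^2_B}{c^2} &= 1 - \frac{4\nu_x^2 + (1 - \nu_x^2)\,\nu_y^2}{(1 + \nu_x^2)^2} \\
&= \frac{1}{(1 + \nu_x^2)^2}\left[(1 + \nu_x^4 + 2\nu_x^2) - \{4\nu_x^2 + (1 - \nu_x^2)\,\nu_y^2\}\right] \\
&= \frac{(1 - \nu_x^2)(1 - \nu_x^2 - \nu_y^2)}{(1 + \nu_x^2)^2}.
\end{align*}

Therefore

\begin{align}
\Gamma'_A &= \sqrt{\frac{1 - \nu_x^2}{1 - \nu_x^2 - \nu_y^2}} = \frac{1}{\gamma\sqrt{1 - \nu_x^2 - \nu_y^2}}, \notag\\
\Gamma'_B &= \frac{1 + \nu_x^2}{\sqrt{(1 - \nu_x^2)(1 - \nu_x^2 - \nu_y^2)}} = \frac{\gamma\,(1 + \nu_x^2)}{\sqrt{1 - \nu_x^2 - \nu_y^2}}. \tag{4.41}
\end{align}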

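The new definition (4.40) yields the following momenta for A and B before the collision:

$$\mathbf{p}'^{(\text{initial})}_A = \Gamma'_A m_0\mathbf{u}'_A = \left(0,\ -\frac{\nu_y}{\sqrt{1 - \nu_x^2 - \nu_y^2}}\right) m_0 c, \qquad \mathbf{p}'^{(\text{initial})}_B = \Gamma'_B m_0\mathbf{u}'_B = \left(-\frac{2\gamma\nu_x}{\sqrt{1 - \nu_x^2 - \nu_y^2}},\ \frac{\nu_y}{\sqrt{1 - \nu_x^2 - \nu_y^2}}\right) m_0 c.$$

Hence, the initial momentum of the system is

$$\mathbf{p}'^{(\text{initial})} = \left(-\frac{2\gamma\nu_x}{\sqrt{1 - \nu_x^2 - \nu_y^2}},\ 0\right) m_0 c. \tag{4.42}$$

Similarly, the momenta of the particles after the collision are as follows:

$$\mathbf{p}'^{(\text{final})}_A = \Gamma'_A m_0\mathbf{v}'_A = \left(0,\ \frac{\nu_y}{\sqrt{1 - \nu_x^2 - \nu_y^2}}\right) m_0 c, \qquad \mathbf{p}'^{(\text{final})}_B = \Gamma'_B m_0\mathbf{v}'_B = \left(-\frac{2\gamma\nu_x}{\sqrt{1 - \nu_x^2 - \nu_y^2}},\ -\frac{\nu_y}{\sqrt{1 - \nu_x^2 - \nu_y^2}}\right) m_0 c.$$

Hence, the final momentum of the system is

$$\mathbf{p}'^{(\text{final})} = \left(-\frac{2\gamma\nu_x}{\sqrt{1 - \nu_x^2 - \nu_y^2}},\ 0\right) m_0 c. \tag{4.43}$$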

From Eqs. (4.42) and (4.43), we find that the new definition (4.40) of linear momentum will conserve linear momentum in both S and S'. In Sec. 4.6, we shall prove this statement for a more general collision, and in Sec. 8.4, we shall arrive at the same definition of momentum in a more natural way while discussing 4-momentum. For the time being we shall confirm this new definition through the following equation:

$$\mathbf{p} = \Gamma m_0\mathbf{v} = \Gamma m_0 c\boldsymbol{\beta}, \tag{4.44}$$

where $\mathbf{v}$ is the velocity of the particle. We have used a subscript 0 under $m$ to underscore the fact that $m_0$ is the intrinsic mass, often also called the rest mass of the particle, for reasons that will be explained.
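The conservation of $\mathbf{p} = \Gamma m_0\mathbf{v}$ in S' for the symmetric collision can be confirmed numerically. As before, this is only a sketch: $\nu_x = 0.6$ and $\nu_y = 0.3$ are arbitrary illustrative values, velocities are in units of $c$, momenta in units of $m_0 c$, and the helper names are hypothetical.

```python
import math

def to_primed(v, beta):
    """Velocity transformation in two dimensions; velocities in units of c."""
    gamma = 1 / math.sqrt(1 - beta**2)
    d = 1 - v[0] * beta
    return ((v[0] - beta) / d, v[1] / (gamma * d))

def momentum(v):
    """Gamma * v for unit rest mass, in units of m0*c."""
    big_gamma = 1 / math.sqrt(1 - v[0] ** 2 - v[1] ** 2)
    return (big_gamma * v[0], big_gamma * v[1])

def total_in_primed(velocities, beta):
    """Total relativistic momentum in S' of particles whose velocities are given in S."""
    parts = [momentum(to_primed(v, beta)) for v in velocities]
    return (sum(p[0] for p in parts), sum(p[1] for p in parts))

nx, ny = 0.6, 0.3
before = [(nx, -ny), (-nx, ny)]        # u_A, u_B in S
after = [(nx, ny), (-nx, -ny)]         # v_A, v_B in S
p_initial = total_in_primed(before, nx)
p_final = total_in_primed(after, nx)
print("S' initial:", p_initial)
print("S' final:  ", p_final)
```

Both totals come out as $(-2\gamma\nu_x/\sqrt{1 - \nu_x^2 - \nu_y^2},\ 0)$ up to rounding, with $\gamma = 1/\sqrt{1 - \nu_x^2}$.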


Frequently, the product $\Gamma m_0$ is called the relativistic mass. We shall reserve the symbol $m$ (i.e. $m$ without subscript) to mean relativistic mass:

$$m = \Gamma m_0 = \frac{m_0}{\sqrt{1 - v^2/c^2}}. \tag{4.45}$$


In contrast with m, the rest mass m, measures the inertia of the particle 
when the particle is momentarily at rest, i.e. the inertia of the particle when 
the particle is either at rest, or is being accelerated from its (momentary) 
rest position. We can alternatively express momentum in the old style: 


p = MV, (4.46) 


where the mass m appearing now is, however, relativistic mass. It is a 
variable mass and its relation to the invariant rest mass mM, is given by the 
expression (4.45). 

Later in this chapter (see Eq. (4.79)) we shall show that the “total
energy” E (sum of rest energy and the kinetic energy) of a particle is
proportional to the relativistic mass m. In fact $E = mc^2$. Therefore, m is
also a measure of the particle’s energy. Hence a plot of m vs. β is also a
plot of E vs. β.

Variation of momentum p and energy E with velocity has been plotted
in Fig. 4.6, below. In this figure, we have indicated the relativistic
momentum p using the relativistic formula (4.44), and the non-relativistic
momentum $p_{NR}$, obtained by using the N.R. formula (4.35). It can be noted
that the two plots are almost identical in the region β ≲ 0.3, suggesting that
we can use N.R. formulas for momentum and energy all the way up to ~0.3c,
without causing appreciable error.
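The size of the error at β ≈ 0.3 can be checked directly, because the ratio of the relativistic momentum (4.44) to the N.R. value is simply the Lorentz factor. The short Python sketch below is only an illustration; the function names are our own choice and do not come from the text.

```python
import math

def lorentz_factor(beta):
    """Return Gamma = 1/sqrt(1 - beta^2) for a speed v = beta*c (0 <= beta < 1)."""
    return 1.0 / math.sqrt(1.0 - beta * beta)

def momentum_ratio(beta):
    """Ratio of the relativistic momentum, Eq. (4.44), to the N.R. value m0*v.

    p / p_NR = (Gamma * m0 * v) / (m0 * v) = Gamma, independent of m0.
    """
    return lorentz_factor(beta)

if __name__ == "__main__":
    for beta in (0.1, 0.3, 0.5, 0.9):
        excess = (momentum_ratio(beta) - 1.0) * 100.0
        print(f"beta = {beta:.1f}: p exceeds m0*v by {excess:.1f}%")
```

At β = 0.3 the two momenta differ by roughly 5%, which is the sense in which the N.R. formula is still usable there.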


4.4. Force 


Newton’s second law of motion is a law as well as a quantitative definition 
of force. Following this law, the force F acting on a particle is measured as 
the rate of change of its momentum p: 


$\mathbf{F} = \frac{d\mathbf{p}}{dt} = \frac{d(\Gamma m_0 \mathbf{v})}{dt} = \frac{d(m\mathbf{v})}{dt}$, (4.47)

where m is the relativistic mass as defined in Eq. (4.45).

With the definition of force as adopted in Eq. (4.47), the proportionality
between force F and acceleration a is lost. In particular, F and a are in
general no longer parallel. We shall establish a relation between the two, as
shown in Eq. (4.82) below.


We shall now illustrate the second law by working out the effect of a
constant force F on a particle of rest mass $m_0$ which is at rest at t = 0.
Since the resulting motion is one dimensional, we shall drop the vector
symbol.

From Eq. (4.47),

$\frac{dp}{dt} = F$, a constant. (4.48)

Integrating, and using the initial condition: p = 0 when t = 0, we get

$p = Ft$. (4.49)

However, from Eq. (4.44)

$p = \Gamma m_0 v = \Gamma\beta m_0 c$, (4.50)

where β and Γ are defined in Eq. (4.15). From (4.49) and (4.50),

$\Gamma\beta = \frac{\beta}{\sqrt{1 - \beta^2}} = \frac{Ft}{m_0 c}$. (4.51)–(4.53)

Simplifying, we get

$v = c\beta = c\,\frac{Ft/m_0 c}{\sqrt{1 + (Ft/m_0 c)^2}}$. (4.54)


Note from Eq. (4.54) that as t → ∞, β → 1. That is, the velocity of the
particle approaches c asymptotically with time. Also note that the result
shown in Eq. (4.54) is identical with that in Eq. (4.21), if we set
$a = F/m_0$. The agreement of the two results suggests that a relativistic
particle under a constant force is undergoing a constant acceleration in its
instantaneous rest frame.

We can integrate Eq. (4.54) to obtain an expression for the distance x
covered by the particle in time t:

$x = \int_0^t v\,dt = c\int_0^t \beta\,dt = c\int_0^t \frac{(Ft/m_0 c)\,dt}{\sqrt{1 + (Ft/m_0 c)^2}}$, (4.55)

or

$x = \frac{m_0 c^2}{F}\left[\sqrt{1 + \left(\frac{Ft}{m_0 c}\right)^2} - 1\right]$. (4.56)

We can make the formulas (4.54) and (4.56) look less formidable by
defining a characteristic time

$\tau_0 = \frac{m_0 c}{F}$. (4.57)

Formulas (4.54) and (4.56) will now take the following easier looking
forms:

$v = c\beta = c\,\frac{t/\tau_0}{\sqrt{1 + (t/\tau_0)^2}}$, (4.58)

$x = c\tau_0\left[\sqrt{1 + (t/\tau_0)^2} - 1\right]$. (4.59)


All relativistic formulas should satisfy one important test, namely they
should yield the corresponding old familiar results in the non-relativistic
limit:

N.R. limit ⇒ $v \ll c$, $\beta \to 0$, $\Gamma \to 1$. (4.60)

On the other extreme we have the ultra-relativistic limit:

U.R. limit ⇒ $v \approx c$, $\beta \to 1$, $\Gamma \to \infty$. (4.61)

In the present case, the criterion (4.60) is realized during the early times
$t \ll \tau_0$, when the speed gained by the particle is still small compared
to the speed of light, and the momentum Ft imparted to it by the force F is
small compared to $m_0 c$. Applying this criterion to Eq. (4.56), it is easily
seen that

$x \approx \frac{m_0 c^2}{F}\left[1 + \frac{1}{2}\left(\frac{Ft}{m_0 c}\right)^2 - 1\right] = \frac{1}{2}\left(\frac{F}{m_0}\right)t^2$. (4.62)

Here $F/m_0$ is the acceleration a, so that we get back our familiar
non-relativistic kinematic result: $x = \frac{1}{2}at^2$.

If we apply the approximation (4.60) in Eq. (4.54), we get

$v \approx \frac{F}{m_0}\,t = at$, (4.63)

which is again the old familiar non-relativistic result.
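The two limits can also be seen numerically. The following Python sketch evaluates Eqs. (4.58) and (4.59) in the dimensionless variable s = t/τ₀ and compares them with the N.R. expressions v = at and x = at²/2. It is an illustration only; the function names and the sample values of s are our own assumptions.

```python
import math

def beta_of_t(s):
    """Eq. (4.58): beta = v/c as a function of s = t/tau0."""
    return s / math.sqrt(1.0 + s * s)

def x_of_t(s):
    """Eq. (4.59): distance in units of c*tau0 as a function of s = t/tau0."""
    return math.sqrt(1.0 + s * s) - 1.0

def beta_nr(s):
    """N.R. limit v = a*t with a = F/m0, i.e. beta = t/tau0."""
    return s

def x_nr(s):
    """N.R. limit x = a*t^2/2, i.e. x/(c*tau0) = s^2/2."""
    return 0.5 * s * s

if __name__ == "__main__":
    print(" t/tau0   beta    beta_NR   x/(c tau0)  x_NR/(c tau0)")
    for s in (0.01, 0.1, 1.0, 3.0, 10.0):
        print(f"{s:7.2f}  {beta_of_t(s):.4f}  {beta_nr(s):7.2f}"
              f"  {x_of_t(s):10.5f}  {x_nr(s):10.5f}")
```

For s ≪ 1 the two columns of each pair agree closely; for s ≫ 1 the relativistic β saturates just below 1 while the N.R. value grows without limit.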

Before leaving this context it can be useful to get a “numerical feel”
for the relativistic formulas just derived, by considering two extreme
examples of commonly encountered forces. In the first example, we shall
consider a particle subjected to a constant force equal to the force of gravity
(as experienced near the surface of the earth). The force on the particle is
then $F = m_0 g$, where g = 9.81 m/s². Therefore, the characteristic time is

$\tau_0 = \frac{m_0 c}{F} = \frac{c}{g} = \frac{3 \times 10^8\ \text{m/s}}{9.81\ \text{m/s}^2} = 3.06 \times 10^7$ sec = 354 days ≈ 1 year.

If we measure time in years, then Eqs. (4.58) and (4.63) will give:

$\beta = \frac{v}{c} = \frac{t}{\sqrt{1 + t^2}}$ (General result), (4.64)

$\beta = t$ (N.R. result). (4.65)


The pre-relativistic result (4.65) tells us that the speed of the particle 
should increase linearly and would equal the speed of light in 1 year, twice 
that speed in 2 years, and so on, and will keep on increasing without limit. 
The relativistic formula (4.64), on the other hand, predicts a linear increase 
in speed in the beginning, followed by an asymptotic approach to the 
ultimate speed c at the end. It predicts a speed of 0.71c at the end of 1 year, 
0.89c after 2 years, 0.95c after 3 years, and so on, so that the rate of 
increase slows down as time progresses. The v–t relationship is illustrated
in Fig. 4.5.

Fig. 4.5. Velocity β and displacement x as functions of time t under a constant force.


We have just seen that it takes 1 year to reach 71% of the speed of light
under a force equal to the terrestrial force of gravity. Therefore, if a rocket
engine keeps firing continuously to generate a thrust equal to the weight of
the space vehicle for one full year, it will come somewhat close to the speed
of light. The above example therefore serves to illustrate how impossible it
is for a space traveller to attain a speed anywhere near that of light.
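The numbers quoted in this example are easy to reproduce. The sketch below uses the rounded value c = 3 × 10⁸ m/s adopted in the text; the variable and function names are ours and are meant only as an illustration.

```python
import math

C = 3.0e8                 # speed of light in m/s (rounded value used in the text)
G = 9.81                  # acceleration due to gravity in m/s^2
SECONDS_PER_DAY = 86400.0

# Characteristic time tau0 = m0*c/F = c/g for F = m0*g
tau0_seconds = C / G
tau0_days = tau0_seconds / SECONDS_PER_DAY

def beta_after(years):
    """Eq. (4.64), taking tau0 as one year, as the text does."""
    return years / math.sqrt(1.0 + years * years)

if __name__ == "__main__":
    print(f"tau0 = {tau0_seconds:.3e} s = {tau0_days:.0f} days")
    for years in (1, 2, 3):
        print(f"after {years} year(s): beta = {beta_after(years):.2f}")
```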

If, however, the above example suggests that special relativity has 
nothing to do with terrestrial experiments and down-to-earth applications in 
physics, then the following example may correct such a notion. 

Linear accelerators of earlier designs like the Van de Graaff accelerator
and the Cockcroft–Walton accelerator were used mostly to accelerate protons,
deuterons, alpha-particles and other ions. Each of these machines can be
designed to generate a potential difference of several million volts across
the length of a tube called the accelerating tube. The charged particles
required to be accelerated are allowed to fall through the potential
difference between the ends of the tube.

Let us consider a Van de Graaff accelerator having a length of 1 m and a
potential difference of 5 million volts. We shall use this machine to
accelerate an electron. We shall, therefore, need to calculate the velocity
attained by the electron after traversing the length of the tube and the time
required to cover this distance. It is given that the rest mass of an electron
is $m_0 = 9.11 \times 10^{-31}$ kg. Charge of the electron is
$e = 1.6 \times 10^{-19}$ coul. Taking the electric field to be uniform along
the tube, $E = 5 \times 10^6$ V/m, the force on the electron is

$F = eE = 1.6 \times 10^{-19} \times 5 \times 10^6 = 8 \times 10^{-13}$ N.

The characteristic time is

$\tau_0 = \frac{9.11 \times 10^{-31} \times 3 \times 10^8}{8 \times 10^{-13}} = 3.416 \times 10^{-10}$ sec.

From Eq. (4.58), we can obtain the speed of the electron at different
times as follows:

$v = 0.707c$ at $t = \tau_0 = 3.416 \times 10^{-10}$ sec,
$v = 0.89c$ at $t = 2\tau_0 = 6.832 \times 10^{-10}$ sec,
$v = 0.95c$ at $t = 3\tau_0 = 10.248 \times 10^{-10}$ sec.


We have plotted β vs. t and x vs. t in Fig. 4.5. The time axis has been
calibrated in the unit of $\tau_0$. In Fig. 4.5(a), we have indicated the
velocity achieved after time intervals of $\tau_0$, $2\tau_0$, $3\tau_0$. In
Fig. 4.5(b), the vertical axis represents displacement x, and has been
calibrated in the unit of $c\tau_0$. We have indicated the displacement x at
times $\tau_0$, $2\tau_0$, $3\tau_0$, $4\tau_0$, $5\tau_0$. It can be seen
that after time $t = 2\tau_0$, the x–t graph is almost a straight line.

The time required to cover the distance of 1 m can be calculated from
Eq. (4.59) using the value of $\tau_0$. It works out to be
$t = 10.71\tau_0 = 36.59 \times 10^{-10}$ sec. The velocity of the electron
after accelerating through this distance is now obtained from (4.58) to be
0.99565c.

Note that the electron gains a speed equal to 99.56% of the speed of
light by falling through a potential difference of 5 million volts. The kinetic
energy gained by the electron in this process is 5 million electron volts, or
5 MeV. We shall take up the concept of relativistic energy in the next
section.
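The accelerator example can be reproduced in a few lines. The Python sketch below inverts Eq. (4.59) to find the transit time over 1 m and then evaluates Eq. (4.58). It assumes a uniform field along the tube and uses the rounded constants quoted in the text; the helper name `time_to_cover` is our own.

```python
import math

M0 = 9.11e-31        # electron rest mass in kg (value used in the text)
E_CHARGE = 1.6e-19   # electron charge in coulomb (value used in the text)
C = 3.0e8            # speed of light in m/s (rounded)
FIELD = 5.0e6        # V/m: 5 million volts across 1 m, assumed uniform
LENGTH = 1.0         # length of the accelerating tube in m

force = E_CHARGE * FIELD          # F = eE
tau0 = M0 * C / force             # Eq. (4.57)

def time_to_cover(x):
    """Invert Eq. (4.59): t = tau0 * sqrt((1 + x/(c*tau0))**2 - 1)."""
    u = 1.0 + x / (C * tau0)
    return tau0 * math.sqrt(u * u - 1.0)

t_exit = time_to_cover(LENGTH)
s = t_exit / tau0
gamma_exit = math.sqrt(1.0 + s * s)           # Lorentz factor at the exit
beta_exit = s / gamma_exit                    # Eq. (4.58)
kinetic_mev = (gamma_exit - 1.0) * M0 * C**2 / E_CHARGE / 1.0e6

if __name__ == "__main__":
    print(f"tau0   = {tau0:.3e} s")
    print(f"t_exit = {s:.2f} tau0 = {t_exit:.3e} s")
    print(f"beta   = {beta_exit:.5f}")
    print(f"K      = {kinetic_mev:.2f} MeV")
```

The last line is a consistency check: the kinetic energy (Γ − 1)m₀c² equals the work Fx done over the tube, i.e. 5 MeV.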


4.5. Energy 


Let us review how an expression for kinetic energy can be derived in
non-relativistic physics. For simplicity, we first consider a particle of mass
m in one-dimensional motion under a constant force F. The acceleration is
a = F/m and is a constant. If the particle starts from rest from the origin
and moves along the X-axis, then we have the familiar kinematic equation for
the velocity v reached after traversing a distance x: $v^2 = 2ax$, so that
$Fx = max = \frac{1}{2}mv^2$.

The quantity Fx is the work done by the force F in displacing the
particle through a distance x. Using the conservation of energy principle,
one may argue that this quantity Fx of potential energy must have been lost
by some agent (which may be a gravitational or electric field, a machine,
human muscle, or something else) and has been transferred to the particle in
the form of its kinetic energy. Therefore, the kinetic energy K of the particle
is as follows:

K = work done in raising the velocity of the particle from 0 to v
$= Fx = \frac{1}{2}mv^2$. (N.R. result) (4.66)


For obtaining the same result from a more general force F which has
arbitrarily variable magnitude and direction, we proceed as follows:

$K = \text{work done} = \int \mathbf{F}\cdot d\mathbf{r} = m\int \frac{d\mathbf{v}}{dt}\cdot\mathbf{v}\,dt = m\int \mathbf{v}\cdot d\mathbf{v} = \frac{m}{2}\int d(\mathbf{v}\cdot\mathbf{v}) = \int d\left(\frac{1}{2}mv^2\right)$. (4.67)

In the above equation, the integration is carried out over the path along
which the particle has moved. Since the velocity of the particle is assumed to
be zero at the beginning of the path, the above integration gives the same
non-relativistic result, namely

$K = \frac{1}{2}mv^2$ (N.R. result). (4.68)


We shall now apply a similar argument to a relativistic particle. To start
with, we first consider the simplest example, namely a constant force F
applied on a particle along the X-axis. The work done in moving the particle
over a distance x is, as before, Fx. In this process, the velocity of the
particle increases from 0 to v. It is seen from Eqs. (4.53) and (4.52) that

$\Gamma^2 = \frac{1}{1 - \beta^2} = 1 + \frac{\beta^2}{1 - \beta^2} = 1 + \Gamma^2\beta^2$, (4.69)

$\Gamma^2 = 1 + \left(\frac{Ft}{m_0 c}\right)^2$. (4.70)

It now follows with the help of Eq. (4.56) that

$K = Fx = (\Gamma - 1)m_0 c^2$. (4.71)


One can derive the above relation for a more general case. Consider the
applied force F to have arbitrary magnitude and direction so that the
particle will now move along a more general curvilinear trajectory. Using
the defining equations (4.47) and (4.44) for force and momentum, we get

$W = \text{work done} = \int \mathbf{F}\cdot d\mathbf{r} = m_0\int \frac{d(\Gamma\mathbf{v})}{dt}\cdot\mathbf{v}\,dt$. (4.72)

The right-hand side of (4.72) will readily transform to the form shown
in (4.71) with the help of the following identity:

$\frac{d(\Gamma\mathbf{v})}{dt}\cdot\mathbf{v} = c^2\frac{d\Gamma}{dt}$. (4.73)

To prove Eq. (4.73), we first note that

$\frac{1}{\Gamma^2} = 1 - \frac{v^2}{c^2}$. (4.74)

Differentiating either side, we get

$\frac{2c^2}{\Gamma^3}\frac{d\Gamma}{dt} = \frac{d(v^2)}{dt}$. (4.75)

Therefore,

$\frac{d(\Gamma\mathbf{v})}{dt}\cdot\mathbf{v} = v^2\frac{d\Gamma}{dt} + \Gamma\frac{d\mathbf{v}}{dt}\cdot\mathbf{v} = v^2\frac{d\Gamma}{dt} + \frac{\Gamma}{2}\frac{d(v^2)}{dt} = \left(v^2 + \frac{c^2}{\Gamma^2}\right)\frac{d\Gamma}{dt}$.

Also note from (4.74) that $v^2 + c^2/\Gamma^2 = c^2$. Hence the identity
(4.73) is proved. (QED)

From Eqs. (4.72) and (4.73), we have

$W = m_0 c^2\int_1^{\Gamma} d\Gamma = (\Gamma - 1)m_0 c^2$. (4.76)

Here we have used the fact that at t = 0, v = 0 so that Γ = 1. The kinetic
energy is therefore given by the earlier expression (4.71). We rewrite this as
follows:

$K = W = (\Gamma - 1)m_0 c^2$. (4.77)


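We shall show that the above expression for kinetic energy indeed
reduces to the non-relativistic expression (4.68) under the condition (4.60).
For β ≪ 1, the binomial expansion gives

$\Gamma = (1 - \beta^2)^{-1/2} \approx 1 + \frac{1}{2}\beta^2$. (4.78a)

Hence,

$K = (\Gamma - 1)m_0 c^2 \approx \frac{1}{2}m_0\beta^2 c^2 = \frac{1}{2}m_0 v^2$. (N.R. result) (4.78b)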
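How quickly the N.R. expression loses accuracy can be judged from a short numerical comparison of (Γ − 1)m₀c² with m₀v²/2, both expressed in units of the rest energy m₀c². The Python sketch below is an illustration only, with function names of our own choosing.

```python
import math

def kinetic_exact(beta):
    """Relativistic kinetic energy in units of m0*c^2: K/(m0 c^2) = Gamma - 1."""
    return 1.0 / math.sqrt(1.0 - beta * beta) - 1.0

def kinetic_nr(beta):
    """N.R. kinetic energy in units of m0*c^2: (1/2) m0 v^2 / (m0 c^2) = beta^2/2."""
    return 0.5 * beta * beta

def relative_error(beta):
    """Fraction by which the N.R. formula underestimates the kinetic energy."""
    exact = kinetic_exact(beta)
    return (exact - kinetic_nr(beta)) / exact

if __name__ == "__main__":
    for beta in (0.01, 0.1, 0.3, 0.6, 0.9):
        print(f"beta = {beta:4.2f}: Gamma - 1 = {kinetic_exact(beta):.6f}, "
              f"beta^2/2 = {kinetic_nr(beta):.6f}, "
              f"error = {100.0 * relative_error(beta):.2f}%")
```

The error stays below 1% up to β ≈ 0.1 and grows rapidly thereafter, since the next term in the expansion of Γ is of order β⁴.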


The quantity $E = K + m_0 c^2$ is called the total energy E of a “free
particle” having velocity v:

$E = K + m_0 c^2 = \Gamma m_0 c^2$, or $E = mc^2$, (4.79)

where m is the relativistic mass, as defined in Eq. (4.45). In contrast
with the total energy $mc^2$, the quantity $m_0 c^2$ is called the rest
energy, or the rest mass energy of the particle. When v ≪ c, the above
equation reduces to

$E = K + m_0 c^2 \approx \frac{1}{2}m_0 v^2 + m_0 c^2$ (N.R. formula). (4.80)

Note that E = kinetic energy + mass energy, always, even in the N.R.
limit, as we shall find in many applications of the energy formula in the
sequel.

In Sec. 4.6, we shall present a satisfactory explanation why $mc^2$ is
looked upon as the total energy of a relativistic particle.

It is seen from (4.79) that the Γ-factor represents the ratio between the
total energy and the rest energy of a particle. Note from Eqs. (4.60) and
(4.61) that for a non-relativistic particle Γ ≈ 1, and the total energy is
almost exclusively its rest energy. For an ultra-relativistic particle (see the
examples in the following paragraphs), Γ ≫ 1. Thus, the Γ-factor is
sometimes used as an index of the relativistic “magnitude” of high energy
particles.

In Fig. 4.6, we have plotted variation of momentum p and the energy E 
with velocity. 


Fig. 4.6. Momentum and energy as functions of velocity.


The rate of change of Γ is proportional to the power ϖ delivered by the
force F, as we can see using Eqs. (4.77) and (4.72). (You can pronounce the
symbol ϖ as “varpi”, a variant of π.)

$\varpi = \mathbf{F}\cdot\mathbf{v} = \frac{dW}{dt} = \frac{dK}{dt} = \frac{dE}{dt} = m_0 c^2\frac{d\Gamma}{dt}$. (4.81)

The relation $\mathbf{F}\cdot\mathbf{v} = dE/dt$ is valid when the rest mass
$m_0$ of the particle moving under the force F remains constant. We shall in
future encounter a situation when this assumption will not be valid. We shall
then avoid Eq. (4.81).

As we remarked following Eq. (4.47), F and a are not parallel in general.
We shall obtain a relation between the two, using the relation (4.81), and the
definition of force and momentum as given in (4.47) and (4.46):

$\mathbf{F} = \frac{d\mathbf{p}}{dt} = \frac{d}{dt}(\Gamma m_0\mathbf{v}) = m_0\Gamma\frac{d\mathbf{v}}{dt} + m_0\mathbf{v}\frac{d\Gamma}{dt} = m\mathbf{a} + \frac{\varpi}{c^2}\mathbf{v}$. (4.82)


The following relationship between the total energy E and the
momentum p is extremely useful:

$E^2 = p^2 c^2 + m_0^2 c^4$. (4.83)

In the N.R. limit, the above formula approximates to the formula (4.80), as
the reader can easily verify.

To prove (4.83), we shall use Eq. (4.79), the identity (4.69), and Eq.
(4.44):

$E^2 = m^2 c^4 = \Gamma^2 m_0^2 c^4 = (1 + \Gamma^2\beta^2)m_0^2 c^4 = m_0^2 c^4 + p^2 c^2$. (QED)

To specialize Eq. (4.83) for a massless particle, like the photon, we set
$m_0 = 0$ and get the corresponding energy-momentum relation

$E = cp$, for a photon. (4.84)


As a simple application of Eq. (4.83), we shall calculate the momenta
and the velocities of the radioactive particles emitted during the beta-decay
of a typical gamma-emitter (commonly used in college physics
laboratories), namely $^{60}$Co. $^{60}$Co emits a beta-particle of kinetic
energy 0.31 MeV, which is followed by two gamma-rays of energy 1.33 MeV and
1.17 MeV, respectively. The rest energy of the beta-particle (i.e. electron) is
known to be $m_0 c^2$ = 0.51 MeV. Therefore, from (4.79), E = 0.31 + 0.51 =
0.82 MeV.

Hence, $pc = \sqrt{E^2 - m_0^2 c^4} = \sqrt{(0.82)^2 - (0.51)^2} = 0.64$ MeV.

If one uses MeV/c as a unit of momentum, then p = 0.64 MeV/c.

To compute the velocity of the emitted electron, we shall first compute
the Γ-factor, and use the following relation (which follows from Eq. (4.74)):

$\beta = \frac{\sqrt{\Gamma^2 - 1}}{\Gamma}$. (4.85)

The following estimates are obtained using (4.79) and (4.85):
$\Gamma = \frac{E}{m_0 c^2} = \frac{0.82}{0.51} = 1.61$; β = 0.78.

The above example shows that the velocities of the beta-particles emitted
by $^{60}$Co and other radioactive isotopes are fairly relativistic. In the
present case, the velocity of the emitted beta-particle is v = 0.78c, that is,
about 78% of the speed of light. It also follows from Eq. (4.83) that the
momenta of the two photons emitted in the radioactive decay of $^{60}$Co are
1.33 MeV/c and 1.17 MeV/c, respectively (because the rest mass of a photon is
zero).


As a second example, we shall consider the beta decay of $^{34}$Cl, which
decays by the emission of a positron of energy 4.5 MeV. As before, E =
4.50 + 0.51 = 5.01 MeV.

Hence, $pc = \sqrt{(5.01)^2 - (0.51)^2} = 4.98$ MeV, so that p = 4.98 MeV/c.
Also, $\Gamma = \frac{5.01}{0.51} = 9.8$, implying the highly relativistic
nature of the emitted positron. Using Eq. (4.15), one now gets β = 0.995.
That is, the velocity of the emitted particle is 99.5% of the speed of light.

The above examples serve to illustrate that relativity is not just a 
Utopian idea, far removed from real life encounters. Even undergraduate 
students routinely experiment with electrons and positrons travelling with 
almost the speed of light, in their college laboratories. At the same time it 
should also be remembered that it is not so easy to obtain heavier particles, 
like protons, alpha-particles and heavy ions, at relativistic speeds. The 
following example will illustrate this point. 

Consider an alpha-particle emitted from the radioactive decay of the
isotope $^{212}$Po. It has a kinetic energy of 8.78 MeV. This is about the
highest energy with which any particle is emitted from natural radioactivity.
The rest mass energy of the alpha-particle is $m_0 c^2$ = 3727.23 MeV.

Therefore, $\Gamma = \frac{K + m_0 c^2}{m_0 c^2} = \frac{8.78 + 3727.23}{3727.23} = 1.0024$.

Thus, Γ ≈ 1, so that the emitted alpha is non-relativistic. Since β ≪ 1,
we can use the approximation (4.78), and obtain the velocity of the
alpha-particle approximately as follows:

$\beta \approx \sqrt{\frac{2K}{m_0 c^2}} = \sqrt{\frac{2 \times 8.78}{3727.23}} = 0.069$.

The speed of an alpha-particle of energy 8.78 MeV is now seen to be about
7% of the speed of light.

Consider protons that have been accelerated to an energy of 1 TeV,
which is equal to $10^{12}$ eV. Such particles are ultra-relativistic
(Γ ≈ 1067). We shall establish the relativistic nature of a more modest energy
proton, say, one of energy 10 GeV, i.e. $10^4$ MeV. The rest energy of a
proton is $m_0 c^2$ = 938.247 MeV. Therefore, E = 10938 MeV. One can calculate
the momentum and velocity using the relations (4.83) and (4.85). The final
answer is as follows: p = 10.9 GeV/c; Γ = 11.66; β = 0.996.
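All four numerical examples of this section follow the same three steps: Eq. (4.79) for E, Eq. (4.83) for pc, and Eq. (4.85) for β. The Python sketch below collects them in one helper. The function name `kinematics` and the layout of the table are our own; the input energies are those quoted in the text.

```python
import math

def kinematics(kinetic_mev, rest_mev):
    """Return (E, pc, Gamma, beta) for a particle of given kinetic and rest energy.

    All energies are in MeV; pc is numerically the momentum in MeV/c.
    """
    total = kinetic_mev + rest_mev                  # Eq. (4.79): E = K + m0 c^2
    pc = math.sqrt(total**2 - rest_mev**2)          # Eq. (4.83)
    gamma = total / rest_mev                        # Gamma = E / (m0 c^2)
    beta = math.sqrt(gamma**2 - 1.0) / gamma        # Eq. (4.85)
    return total, pc, gamma, beta

EXAMPLES = [
    ("electron from Co-60", 0.31, 0.51),
    ("positron from Cl-34", 4.50, 0.51),
    ("alpha, K = 8.78 MeV", 8.78, 3727.23),
    ("proton, K = 10 GeV", 1.0e4, 938.247),
]

if __name__ == "__main__":
    for name, kinetic, rest in EXAMPLES:
        total, pc, gamma, beta = kinematics(kinetic, rest)
        print(f"{name:22s} E = {total:9.2f} MeV  pc = {pc:9.2f} MeV  "
              f"Gamma = {gamma:7.4f}  beta = {beta:.4f}")
```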


4.6. Energy—Momentum Conservation Law 


In Sec. 4.3, we proposed a new definition for momentum, through Eq. 
(4.44), with a promise to provide a general proof later that such a definition 
would ensure Lorentz invariance of the momentum conservation law. (The 
term Lorentz invariance would mean preservation of a certain relationship, 
law or equation after the corresponding quantities have been transformed by 
Lorentz transformation, following a change of the frame of reference.) At 
the time of defining total energy of a particle through Eq. (4.79), we also 
promised to justify the nomenclature “total energy” in a subsequent section. 
We shall now redeem these pledges. Perhaps, the reader will be surprised to 
discover that, unlike in non-relativistic physics, the conservation of 
momentum and energy are not two disjointed concepts, but that one implies 
the other. If we demand that momentum be conserved in all frames of 
reference, that itself will guarantee conservation of total energy, and vice 
versa. 

Before establishing the Lorentz invariance of the conservation laws, it
will be necessary to establish the Lorentz transformation of energy and
momentum. We shall, for simplicity, consider a boost along the X-axis,
$\boldsymbol{\beta} = (\beta, 0, 0)$, as represented by the configuration
shown in Fig. 4.1(a). Let $\mathbf{v} = d\mathbf{r}/dt$ and
$\mathbf{v}' = d\mathbf{r}'/dt'$ represent the velocity of a particle of rest
mass $m_0$, as measured from the Lorentz frames S and S′, respectively. Then,
according to the defining Eqs. (4.44) and (4.79), its momentum and energy in
these two frames will be

$\mathbf{p} = m_0\Gamma\mathbf{v}$, $E = m_0\Gamma c^2$, in S,
$\mathbf{p}' = m_0\Gamma'\mathbf{v}'$, $E' = m_0\Gamma' c^2$, in S′, (4.86)

where Γ and Γ′ represent the Lorentz-factors associated with the velocities
of the particle in S and S′, respectively. It follows from Eq. (4.37) that

$\mathbf{v} = \frac{d\mathbf{r}}{dt} = \frac{1}{\Gamma}\frac{d\mathbf{r}}{d\tau} \;\Rightarrow\; \Gamma\mathbf{v} = \frac{d\mathbf{r}}{d\tau}$. (4.87)

Hence

$\mathbf{p} = m_0\frac{d\mathbf{r}}{d\tau} = m_0\left(\frac{dx}{d\tau}, \frac{dy}{d\tau}, \frac{dz}{d\tau}\right)$, $\frac{E}{c} = m_0\frac{c\,dt}{d\tau}$,
$\mathbf{p}' = m_0\frac{d\mathbf{r}'}{d\tau} = m_0\left(\frac{dx'}{d\tau}, \frac{dy'}{d\tau}, \frac{dz'}{d\tau}\right)$, $\frac{E'}{c} = m_0\frac{c\,dt'}{d\tau}$. (4.88)

Using the Lorentz transformation equations (4.3) for the coordinate
differentials, the momentum and the total energy in the frame S′ are now
expressed as a linear combination of both of these quantities in S (for
convenience we shall use E/c, instead of E. Note that E/c has the same
dimension as momentum p):

$p'_x = m_0\frac{dx'}{d\tau} = m_0\gamma\left(\frac{dx}{d\tau} - \beta\frac{c\,dt}{d\tau}\right) = \gamma\left(p_x - \beta\frac{E}{c}\right)$,
$p'_y = p_y$, $p'_z = p_z$,
$\frac{E'}{c} = m_0\frac{c\,dt'}{d\tau} = m_0\gamma\left(\frac{c\,dt}{d\tau} - \beta\frac{dx}{d\tau}\right) = \gamma\left(\frac{E}{c} - \beta p_x\right)$. (4.89)

Fig. 4.7. Collision of two particles.

The inverse of the above transformation follows without much ado:

$p_x = \gamma\left(p'_x + \beta\frac{E'}{c}\right)$,
$p_y = p'_y$, $p_z = p'_z$,
$\frac{E}{c} = \gamma\left(\frac{E'}{c} + \beta p'_x\right)$. (4.90)

A comparison with Eq. (3.8) shows that the set $(E/c, p_x, p_y, p_z)$
transforms, under a change of the frame of reference, in the same way as the
set (ct, x, y, z).
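The transformation (4.89) and its inverse (4.90) can be tried out numerically. The Python sketch below boosts the set (E/c, p_x, p_y, p_z) along the X-axis and confirms that the combination (E/c)² − p², which equals (m₀c)² by Eq. (4.83), is the same in both frames. The function names and the sample momentum are our own assumptions, chosen only for illustration.

```python
import math

def boost_x(four_momentum, beta):
    """Apply Eq. (4.89): transform (E/c, px, py, pz) from S to S'.

    S' moves with speed beta*c along the +x axis of S.
    """
    e_over_c, px, py, pz = four_momentum
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return (gamma * (e_over_c - beta * px),
            gamma * (px - beta * e_over_c),
            py,
            pz)

def invariant(four_momentum):
    """Return (E/c)^2 - p^2, which equals (m0*c)^2 by Eq. (4.83)."""
    e_over_c, px, py, pz = four_momentum
    return e_over_c**2 - px**2 - py**2 - pz**2

# Hypothetical electron: rest energy 0.51 MeV, momentum components in MeV/c.
REST = 0.51
momentum = (0.5, 0.3, 0.0)
in_s = (math.sqrt(REST**2 + sum(q * q for q in momentum)),) + momentum

in_s_prime = boost_x(in_s, 0.6)          # S -> S', Eq. (4.89)
back_in_s = boost_x(in_s_prime, -0.6)    # S' -> S, Eq. (4.90): same form with -beta

if __name__ == "__main__":
    print("in S :", in_s)
    print("in S':", in_s_prime)
    print("invariant in S, S':", invariant(in_s), invariant(in_s_prime))
```

The inverse transformation is obtained simply by reversing the sign of β, which is why `boost_x` is reused for Eq. (4.90).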

With the above preparation we now reconsider the collision
phenomenon, earlier discussed in Sec. 2.2.2. Figure 4.7 (which
is a reproduction of Fig. 2.2) describes the setting. Two particles A and B
come and collide and result in the emergence of two other particles C and D
from the scene of the collision:

A + B → C + D.

This kind of process occurs frequently in the sub-atomic world, as
illustrated in the famous example

$\pi^- + p \to \Lambda + K^0$,

in which a pi-meson collides with a proton to create a K-meson and a
lambda particle.

As in Sec 2.3, let us assume that the total momentum of the system, as
measured in the frame S, is the same before the collision as after it. Since
momentum is a vector, the above statement implies conservation of its x, y
and z components separately. Using the definition (4.44) of momentum, we
get the following three equations:

$p_{Ax} + p_{Bx} = p_{Cx} + p_{Dx}$,
$p_{Ay} + p_{By} = p_{Cy} + p_{Dy}$, (4.91)
$p_{Az} + p_{Bz} = p_{Cz} + p_{Dz}$.

In the above, $p_{Ax}$ stands for the x-component of the momentum of the
particle A, and so on. Using the transformation equation (4.90), we can now
write the x-components of momentum in terms of quantities measured in
the frame S′. This results in the following equation:

$\gamma\left(p'_{Ax} + \beta\frac{E'_A}{c}\right) + \gamma\left(p'_{Bx} + \beta\frac{E'_B}{c}\right) = \gamma\left(p'_{Cx} + \beta\frac{E'_C}{c}\right) + \gamma\left(p'_{Dx} + \beta\frac{E'_D}{c}\right)$,

or

$\left(p'_{Ax} + p'_{Bx} - p'_{Cx} - p'_{Dx}\right) + \frac{\beta}{c}\left(E'_A + E'_B - E'_C - E'_D\right) = 0$.

Since β is arbitrary, the above equation separates into the following two
equations:

$E'_A + E'_B = E'_C + E'_D$,
$p'_{Ax} + p'_{Bx} = p'_{Cx} + p'_{Dx}$. (4.92)


We note from Eqs. (4.91) and (4.92) that conservation of the x component 
of linear momentum in the frame S implies conservation of both x 
momentum and total energy in the frame S'. We would have come to the 
same conclusion for the other components of momentum had we started this 
exercise from a general boost. We therefore summarize as follows. 
Conservation of momentum in any one frame implies a wider conservation 
law in all other frames, which brings within its fold not only momentum 
conservation, but also energy conservation, that is, conservation of “total 
energy” as defined by Eq. (4.79). Momentum and energy become 
inseparable from each other in relativity, a fact we shall see more 
transparently in Sec. 12.3. 

We shall rewrite this very important concept in the form of more general
equations for a fuller comprehension of its significance. Suppose there are n
particles, which are marked for identification as 1, 2, ..., n. They are
allowed to interact. Following the interaction, they may either retain their
original identities as particles numbered 1, 2, ..., n, or may change into N
other particles, which may now be marked by numbers n + 1, n + 2, ...,
n + N. Considering the more general case of N particles resulting from the
interaction (N = n, and the particle marked i + n is identical with the one
marked i, in the special case where the particles do not change in the
interaction process), we can now use the definitions (4.44) and (4.79) to
rewrite the conservation laws, as implied by Eq. (4.92), in any general
frame of reference S, as follows:

$\sum_{i=1}^{n} m_{0i}\Gamma_i\mathbf{v}_i = \sum_{i=n+1}^{n+N} m_{0i}\Gamma_i\mathbf{v}_i$, (4.93a)

$\sum_{i=1}^{n} m_{0i}\Gamma_i c^2 = \sum_{i=n+1}^{n+N} m_{0i}\Gamma_i c^2$. (4.93b)

In the above, $\Gamma_i = 1/\sqrt{1 - v_i^2/c^2}$. Equations (4.93a) and
(4.93b) together represent conservation of four quantities, namely $p_x$,
$p_y$, $p_z$ and E. One lesson from Eq. (4.93b) is that it is Γ times mass,
and not mass alone, which is a conserved quantity in relativity. (The word
“mass” will always mean rest mass $m_0$, unless relativistic mass m is
explicitly implied.) Conservation of mass, a guiding principle of chemistry,
wilts away under the onslaught of relativity:

$\sum_{i=1}^{n} m_{0i} \neq \sum_{i=n+1}^{n+N} m_{0i}$ (in general).
Mass before reaction ≠ Mass after reaction (in general).

The reader should now see the reason for calling $E = mc^2 = \Gamma m_0 c^2$
the total energy in the definition (4.79). It is this $\Gamma m_0 c^2$ which
is conserved in all natural processes. Since $(\Gamma - 1)m_0 c^2$ has already
been shown to be the kinetic energy, the balance $m_0 c^2$ is supposed to be
the potential energy of the particle manifested in the form of its rest mass.
This rest mass potential energy can be converted into kinetic energy, or some
other form of energy. Nature provides an abundance of such examples.

Most of the practical examples of rest mass potential energy being
converted into kinetic energy lie in the domain of nuclear physics. An
awesome example is the fission of uranium. The $^{235}$U nucleus captures a
slow neutron and spontaneously breaks up into a number of smaller
fragments, releasing in this process an enormous amount of energy. This
fission reaction is utilized not only in the detonation of the atom bomb, but
also in the extraction of useful energy in most of the nuclear power reactors.
Even though the products of this reaction are not the same in different
fissions of the same $^{235}$U isotope, they generally consist of two medium
weight nuclei (one of them with a mass number between 85 and 104, and
the other with a mass number between 130 and 149), along with about two
to three neutrons and about six to seven electrons. One typical such reaction
is the following:

$^{235}\text{U} + n \to {}^{95}\text{Mo} + {}^{139}\text{La} + 7e^- + 2n$.


The right-hand side of the above reaction has less mass than the left- 
hand side, so that mass has not been conserved in the reaction. The energy 
equivalence of the difference between the rest mass before the reaction and 
the rest mass after the reaction is the energy released in a single fission. It 
goes into the kinetic energy of the fragments. When there are some 107° 
nuclear fissions taking place in a fraction of a second, the resulting fission 
fragments will be dashing about with their own shares of the released 
energy. The net effect is then a violent liberation of heat. 

In the table below, we have listed the parents as well as the products of
the above fission reaction, their rest masses in atomic mass unit, and
converted the mass difference into energy, so as to provide a numerical
illustration of how mass is converted into kinetic energy.* We have used the
conversion factor: 1 a.m.u. × c² = 931.16 MeV.

           | Particle | Mass (in a.m.u.) | m_0 c² (in MeV)
Before     | U-235    | 235.0439         |
           | n        | 1.0087           |
           | Total    | 236.0526         | 219,802.73
After      | Mo-95    | 94.9058          |
           | La-139   | 138.9061         |
           | 2n       | 2 × 1.0087       |
           | 7e       | 7 × 0.00055      |
           | Total    | 235.8331         | 219,598.39
Difference |          | 0.2195           | 204.34 (0.09% of the original mass energy)
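The bookkeeping in the table is simple enough to automate. The Python sketch below recomputes the mass difference and the energy released from the masses and the conversion factor quoted above; the dictionary layout and variable names are our own, introduced only for illustration.

```python
AMU_MEV = 931.16   # 1 a.m.u. x c^2 in MeV (conversion factor used in the text)

# Rest masses in a.m.u., as listed in the table above.
before = {"U-235": 235.0439, "n": 1.0087}
after = {
    "Mo-95": 94.9058,
    "La-139": 138.9061,
    "2n": 2 * 1.0087,
    "7e": 7 * 0.00055,
}

mass_before = sum(before.values())
mass_after = sum(after.values())
mass_defect = mass_before - mass_after          # rest mass lost in the reaction
energy_released = mass_defect * AMU_MEV         # converted to kinetic energy, in MeV
fraction_percent = 100.0 * mass_defect / mass_before

if __name__ == "__main__":
    print(f"mass before    = {mass_before:.4f} a.m.u.")
    print(f"mass after     = {mass_after:.4f} a.m.u.")
    print(f"mass defect    = {mass_defect:.4f} a.m.u.")
    print(f"energy release = {energy_released:.2f} MeV "
          f"({fraction_percent:.2f}% of the original mass energy)")
```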


As another example of mass-energy conversion, we shall consider a
fusion reaction, in which four hydrogen nuclei, i.e. protons, fuse to form a
helium nucleus:

$4p \to {}^{4}\text{He} + 2e^+$,

where $e^+$ represents the positron, having the same rest mass as an
electron. We have the following table.

           | Particle | Mass (in a.m.u.) | m_0 c² (in MeV)
Before     | 4p       | 4 × 1.0078       |
           | Total    | 4.0312           | 3753.692
After      | He-4     | 4.0030           |
           | 2e⁺      | 2 × 0.00055      |
           | Total    | 4.0031           | 3727.526
Difference |          | 0.0281           | 26.166 (0.7% of the original mass energy)


We have provided more examples in the suggested problems at the end
of this chapter to give the reader a more thorough familiarity with the
energy-momentum conservation principle.

If the examples cited so far have given the reader a wrong notion that
these principles are applicable only in physics and not in chemistry, then we
shall consider the example of the hydrogen atom to clear this
misunderstanding. Actually the mass of a hydrogen atom is less than the
mass of its constituents, namely, a proton and an electron. However, this
difference, converted into energy, is only 13.6 eV. Compared to the
rest mass energy of the hydrogen atom, which is about 938.36 MeV, this
13.6 eV is negligible, only about $1.45 \times 10^{-6}$% of hydrogen’s rest
energy. The mass gained, or lost, in every chemical reaction is so small
that a chemist does not need to pay any attention to it. However, it
will be useful to remember that when a candle burns, or ice melts, these
processes are accompanied by a very small change in the masses. The end
products are lighter in the first example and heavier in the second one,
although the differences are so tiny that it will be very difficult to measure
them.


4.7. The Centre of Mass of a System of Particles: The Zero 
Momentum Frame 


The centre of mass (C.M.) of a system of particles plays an important role
in non-relativistic classical mechanics, especially in a many-body system,
like the motion of a rigid body (which is made up of a very large number of
atomic particles), and in two-body systems, like the Earth-Moon pair
moving in the gravitational field of the Sun.

The CM frame of such a system is the inertial frame in which the CM is
stationary, even though the individual particles may be moving with
arbitrary velocities. Also the number of the constituent particles, and their
velocities, may differ following a reaction or a collision, and yet the CM of
the system will move on with a constant, unchanged velocity.

The relativistic counterpart of the classical CM is defined along parallel
lines, except that the invariant mass of each constituent particle is now
replaced by the relativistic mass of such particles.

Let us consider a system of N particles, having rest masses, radius vectors
and velocities $\{m_{0i}, \mathbf{r}_i, \mathbf{v}_i;\ i = 1, 2, 3, \ldots, N\}$,
with respect to an inertial frame S. We define the location
$\mathbf{r}_{cm}$ and the velocity $\mathbf{v}_{cm}$ of the CM as follows:

$\mathbf{r}_{cm} = \frac{\sum_{i=1}^{N} m_i\mathbf{r}_i}{\sum_{i=1}^{N} m_i}$, (4.94a)

$\mathbf{v}_{cm} = \frac{d\mathbf{r}_{cm}}{dt} = \frac{\sum_{i=1}^{N} m_i\mathbf{v}_i}{\sum_{i=1}^{N} m_i} = \frac{\mathbf{P}}{M}$, (4.94b)

where $m_i = \Gamma_i m_{0i}$ = the relativistic mass of the particle i, (4.94c)

$\mathbf{P} = \sum_{i=1}^{N} m_i\mathbf{v}_i$ = Total momentum of the system, (4.94d)

$M = \sum_{i=1}^{N} m_i$ = Total relativistic mass of the system. (4.94e)

For zero mass particles, like photons of light frequency ν, $m_i = h\nu/c^2$.

Since P and E are each conserved for an isolated system, these two
values should remain unchanged following a chemical or nuclear reaction.
Therefore, it follows from Eq. (4.94b), with $M = E/c^2$, that the velocity
$\mathbf{v}_{cm}$ should remain unchanged in such a reaction. In order to
clarify the concept of CM, and its uniform motion in the relativistic case, we
have worked out a few examples in Problem 4.7.


The zero momentum (ZM) frame is the relativistic analogue of the CM 
frame of N.R. mechanics. It is the inertial frame in which P = 0. 

To make a transition from the Lab frame to the ZM frame one has to 
make a Lorentz transformation of the energy-momentum 4-vector. We have 
taken up this topic seriously in Sec. 8.8.1. 
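As a small preview of that transformation, the one-dimensional Python sketch below computes the CM velocity of Eq. (4.94b) in the form β_cm = Pc/E, boosts every particle to the frame moving with that velocity, and confirms that the total momentum vanishes there. The choice of particles (a moving proton striking a proton at rest, with a rounded rest energy of 938 MeV) and all function names are our own assumptions, used only for illustration.

```python
import math

def gamma(beta):
    """Lorentz factor for speed beta*c."""
    return 1.0 / math.sqrt(1.0 - beta * beta)

def energy_momentum(rest_energy, beta):
    """Return (E, pc) in MeV for a particle moving along x with speed beta*c."""
    g = gamma(beta)
    return g * rest_energy, g * rest_energy * beta

def boost(energy, pc, beta_frame):
    """Transform (E, pc) to a frame moving with speed beta_frame*c along +x."""
    g = gamma(beta_frame)
    return g * (energy - beta_frame * pc), g * (pc - beta_frame * energy)

# Hypothetical Lab-frame system: (rest energy in MeV, beta) for each particle.
particles = [(938.0, 0.8), (938.0, 0.0)]

lab = [energy_momentum(rest, beta) for rest, beta in particles]
total_energy = sum(e for e, _ in lab)
total_pc = sum(p for _, p in lab)

# Eq. (4.94b): v_cm = P/M with M = E/c^2, i.e. beta_cm = P c / E.
beta_cm = total_pc / total_energy

zm = [boost(e, p, beta_cm) for e, p in lab]
total_pc_zm = sum(p for _, p in zm)

if __name__ == "__main__":
    print(f"beta_cm = {beta_cm:.4f}")
    print(f"total momentum in the ZM frame = {total_pc_zm:.3e} MeV/c")
```

For two equal rest masses with one at rest, the ZM frame moves at less than the arithmetic mean of the two velocities would suggest for the faster particle's energy share: here β_cm = 0.5 for a projectile at β = 0.8.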


4.8. The Twin Paradox 


An example which is often cited for illustrating the relative aspect of time is 
the Twin Paradox. It is a riddle around the ageing rates between the brother 
who stays home and his counterpart who undertakes a space voyage. But 
then, why should there be any ageing difference, given that all inertial 
frames are equal? The clue lies in recognizing that one of the brothers is 
necessarily in an accelerating frame. 

There are several versions of this paradox (see References) most of 
which allude to a very brief but violent acceleration at the beginning, at the 
turnaround point, and at the end of the journey. This tends to take away 
attention from the accelerating aspect which is the crux of the riddle. 
Therefore, we shall choose a version of the paradox (see J D Jackson 
Problem 11.4) in which the acceleration is moderate but steady and lasts 
through the entire duration of the voyage. 

Ram and Sam are twins who have been living in an inertial space lab S 
(shown as inertial frame S in Fig. 4.8) since childhood. Ram being the more 
adventurous of the two, decided to take a space odyssey when they were 20 
years old. He boards a rocket R (shown as frame R) which accelerates at a 
constant rate a with respect to the comoving Lorentz frame R₀. After a time
span of n years, as recorded on his watch, Ram, wanting to come back, 
starts decelerating at the same rate a for another n years to reach zero speed 
with respect to S. Then he turns around, accelerates homeward at the same 
rate a for n years, then decelerates for another n years and safely lands at 
their old Space Lab to reunite with his brother Sam. Ram is now 20 + 4n 
years old. How old is Sam? 

Let us first set up the basic differential equation required for finding the 
answer. 

Let P and Q be two adjacent points along the path of the rocket at 
distances x and x + dx from the starting point O of Ram’s journey. Let v and 
v + dv represent the velocities of Ram with respect to S while passing P 


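(“event $\Theta_P$”) and Q (“event $\Theta_Q$”), respectively. Let $R_0(\Theta_P)$ represent the
co-moving Lorentz frame at the event “$\Theta_P$”. We shall assign the following
coordinates to the above pair of events with respect to the three frames of
reference.

\[ \Theta_P = \begin{cases} (ct,\ x) & \text{in } S \\ (ct',\ 0) & \text{in } R_0(\Theta_P) \\ (c\tau,\ 0) & \text{in } R \end{cases} \tag{4.95} \]

\[ \Theta_Q = \begin{cases} (ct + c\,dt,\ x + dx) & \text{in } S \\ (ct' + c\,dt',\ dx') & \text{in } R_0(\Theta_P) \\ (c\tau + c\,d\tau,\ 0) & \text{in } R \end{cases} \tag{4.96} \]

Noting that the frame R is moving with infinitesimal speed at the event
“$\Theta_Q$” relative to the frame $R_0(\Theta_P)$, it follows that $d\tau = dt'$. Therefore, from
Eq. (4.17a),

\[ \frac{dv}{1 - v^2/c^2} = a\, d\tau. \tag{4.97} \]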


Fig. 4.8. Sam’s frame S and Ram’s frame R. (The figure marks the lab time t, t + dt and the rocket time τ, τ + dτ.)


Fig. 4.9. Round trip route of Ram. 


In Fig. 4.9 we have shown the path followed by Ram. We have 
demarcated his route into four segments namely, outward acceleration along 
OA, deceleration along AB, turning around at B, homeward acceleration 
along BC, followed by deceleration along CD, and finally return home at D. 

Let us consider the first segment of his journey, namely from O to A. Let $t$
and $\tau$ be the times measured in the frames S and R, respectively,
corresponding to the event that “Ram has travelled a distance $x$ from O” on
this segment of his tour. The velocity of Ram at this point is obtained by
integrating Eq. (4.97).

\[ \int_0^{v} \frac{dv}{1 - v^2/c^2} = a \int_0^{\tau} d\tau, \tag{4.98} \]

\[ \text{or} \quad \tanh\frac{a\tau}{c} = \frac{v}{c}. \tag{4.99} \]

Noting that

\[ dt = \gamma\, d\tau, \tag{4.100} \]

where

\[ \gamma = \frac{1}{\sqrt{1 - v^2/c^2}} = \frac{1}{\sqrt{1 - \tanh^2(a\tau/c)}} = \cosh\frac{a\tau}{c}, \tag{4.101} \]

so that

\[ v = \frac{dx}{dt} = \frac{1}{\gamma}\frac{dx}{d\tau}, \tag{4.102} \]

we get

\[ \frac{dx}{d\tau} = c\tanh\frac{a\tau}{c}\cosh\frac{a\tau}{c} = c\sinh\frac{a\tau}{c}. \tag{4.103} \]


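Integrating again,

\[ x = c\int_0^{\tau} \sinh\frac{a\tau}{c}\, d\tau = \frac{c^2}{a}\left[\cosh\frac{a\tau}{c} - 1\right]. \tag{4.104} \]

To find $t$ we integrate Eq. (4.100) and get

\[ t = \int_0^{\tau} \gamma\, d\tau = \int_0^{\tau} \cosh\frac{a\tau}{c}\, d\tau = \frac{c}{a}\sinh\frac{a\tau}{c}. \tag{4.105} \]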


Let the time recorded on the clock of Ram at the instant when he has
reached A be $\tau_0$. Then the time and space coordinates $(ct_A, x_A)$ of the event
“Ram has reached A”, with respect to the frame S, are obtained by replacing
$\tau$ by $\tau_0$ in Eqs. (4.104) and (4.105).

\[ x_A = \frac{c^2}{a}\left[\cosh\frac{a\tau_0}{c} - 1\right]; \tag{4.106a} \]

\[ t_A = \frac{c}{a}\sinh\frac{a\tau_0}{c}. \tag{4.106b} \]

The velocity of Ram at this point follows from Eq. (4.99):

\[ v_A = c\tanh\frac{a\tau_0}{c}. \tag{4.107} \]


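The equation of motion for the second leg, i.e. AB, is obtained by
replacing a by −a in (4.97). Integrating we get

\[ \int_{v_A}^{v} \frac{dv}{1 - v^2/c^2} = -a\int_{\tau_0}^{\tau} d\tau. \tag{4.108} \]

Now making use of Eq. (4.107), we get

\[ v = c\tanh\left[\frac{a(2\tau_0 - \tau)}{c}\right], \quad \tau_0 \le \tau \le 2\tau_0, \tag{4.109} \]

\[ \gamma = \cosh\left[\frac{a(2\tau_0 - \tau)}{c}\right], \quad \tau_0 \le \tau \le 2\tau_0. \tag{4.110} \]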

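To find the corresponding time measured in S, we integrate as in Eq.
(4.105)

\[ \int_{t_A}^{t} dt = \int_{\tau_0}^{\tau} \gamma\, d\tau \tag{4.111} \]

and get

\[ t = 2t_A - \frac{c}{a}\sinh\left[\frac{a(2\tau_0 - \tau)}{c}\right], \quad \tau_0 \le \tau \le 2\tau_0. \tag{4.112} \]

To obtain x, we proceed similarly and get:

\[ \int_{x_A}^{x} dx = \int_{\tau_0}^{\tau} \gamma v\, d\tau, \]

\[ \text{or} \quad x = \frac{c^2}{a}\left[2\cosh\left(\frac{a\tau_0}{c}\right) - \cosh\left(\frac{a(2\tau_0 - \tau)}{c}\right) - 1\right], \quad \tau_0 \le \tau \le 2\tau_0. \tag{4.113} \]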


Setting $\tau = 2\tau_0$ in Eqs. (4.109), (4.112) and (4.113), we get the
coordinates and the velocity of Ram at B:

\[ t_B = 2t_A, \quad x_B = 2x_A, \quad v_B = 0. \tag{4.114} \]
The return journey is a mirror reflection of the outward journey. Hence,
the duration T and the furthest distance D of Ram’s space odyssey, as measured
by Sam, are as follows:

\[ T = 4t_A = \frac{4c}{a}\sinh\left(\frac{a\tau_0}{c}\right), \qquad D = 2x_A = \frac{2c^2}{a}\left[\cosh\left(\frac{a\tau_0}{c}\right) - 1\right]. \tag{4.115} \]

In contrast, the travel time, as measured by Ram himself, is

\[ T_0 = 4\tau_0. \tag{4.116} \]


For a numerical feel of the problem, the reader may consider a to be equal
to the acceleration due to gravity near the surface of the earth, and take $\tau_0$ to
be equal to 5 years, so that Ram returns home at the age of 40. This will
then lead to the following estimates:

\[ D = 1.6 \times 10^{15}\ \text{km}!! \qquad T = 335\ \text{years}!!!! \]
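The estimates above can be checked numerically. The following is a minimal sketch, assuming a = 9.8 m/s², a Julian year of 365.25 days and the modern value of c; the function name `twin_trip` is hypothetical and not part of the text. It evaluates T from Eq. (4.115) and takes the furthest distance as x_B = 2x_A, following Eq. (4.114).

```python
import math

C = 299_792_458.0          # speed of light in m/s
YEAR = 365.25 * 86_400.0   # one Julian year in seconds (assumed convention)

def twin_trip(a, tau0_years):
    """Return (Sam's elapsed time in years, furthest distance in km,
    Ram's elapsed time in years) for the four-leg trip of Sec. 4.8."""
    tau0 = tau0_years * YEAR
    arg = a * tau0 / C                               # dimensionless a*tau0/c
    T = 4.0 * (C / a) * math.sinh(arg)               # Eq. (4.115), in seconds
    D = 2.0 * (C**2 / a) * (math.cosh(arg) - 1.0)    # x_B = 2*x_A, in metres
    return T / YEAR, D / 1000.0, 4.0 * tau0_years    # Eq. (4.116) for Ram

T_sam, D_km, T_ram = twin_trip(a=9.8, tau0_years=5.0)
print(f"Sam ages {T_sam:.0f} yr, Ram ages {T_ram:.0f} yr, "
      f"furthest distance {D_km:.2e} km")
```

Small differences from the rounded figures quoted in the text are to be expected, since the result is sensitive to the values chosen for c, a and the length of the year.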


4.9. Compton Scattering 


As an illustration of the energy-momentum conservation laws, we shall cite 
one example of historical significance in atomic physics. It is the scattering 
of X-rays by atomic electrons resulting in a shift in the wavelength of the 
scattered X-rays — an effect known as Compton Scattering. Since the outer 
electrons (with binding energy of a few electron volts) of the atom are 
chiefly responsible for Compton scattering and since the X-ray photons 
have comparatively higher energies (Kg X-ray photons from Tungsten have 
energy of 69.83keV), these electrons are assumed to be at rest (i.e. zero 
energy) before collision — an assumption that simplifies calculations 
greatly. Figure 4.10 shows the scattering mechanism schematically. The
incoming photon is scattered at an angle −θ with the direction of incidence
(which is taken to be along the X-axis), after hitting an electron, which is
knocked off at an angle of φ.
The energy of a photon of frequency $\nu$ is given as

\[ E(\nu) = h\nu, \tag{4.117} \]

where h is Planck’s constant. The magnitude of the momentum of the same
photon (whose rest mass is zero) follows from Eq. (4.84) to be

\[ p(\nu) = \frac{h\nu}{c}. \tag{4.118} \]

Fig. 4.10. Compton scattering. Schematic diagram.


Let us first write down the momentum conservation equation. The
momentum of the photon, before and after the collision, is given by the
vectors $\mathbf{p}_1$ and $\mathbf{p}_2$ respectively, whose magnitudes are, according to Eq.
(4.118), $p_1 = h\nu_1/c$, $p_2 = h\nu_2/c$, where $\nu_1$, $\nu_2$ are the frequencies of the X-ray
photon before and after the scattering. If $\mathbf{p}_e$ is the momentum of the
recoiling electron, then the momentum conservation statement is that

\[ \mathbf{p}_1 = \mathbf{p}_2 + \mathbf{p}_e, \quad \text{or} \quad \mathbf{p}_e = \mathbf{p}_1 - \mathbf{p}_2. \tag{4.119} \]

Squaring either side,

\[ p_e^2 = p_1^2 + p_2^2 - 2p_1p_2\cos\theta. \tag{4.120} \]


Now we shall obtain the energy conservation equation. Letting the rest
mass of the electron be $m_e$, and using Eq. (4.78), we get the initial and
the final energy of the system as

\[ E_{initial} = h\nu_1 + E_{e,rest} = p_1c + m_ec^2, \]

\[ E_{final} = h\nu_2 + E_{e,moving} = p_2c + \sqrt{m_e^2c^4 + p_e^2c^2}. \]

The energy conservation therefore means

\[ p_1c + m_ec^2 = p_2c + \sqrt{(m_ec^2)^2 + p_e^2c^2}. \]

Simplifying we get

\[ p_e^2 = (p_1 - p_2)^2 + 2(p_1 - p_2)m_ec. \tag{4.121} \]

From (4.120) and (4.121):

\[ (p_1 - p_2)^2 + 2(p_1 - p_2)m_ec = p_1^2 + p_2^2 - 2p_1p_2\cos\theta, \tag{4.122a} \]

\[ \text{or} \quad (p_1 - p_2)m_ec = p_1p_2(1 - \cos\theta), \tag{4.122b} \]

\[ \text{or} \quad \frac{h}{c}(\nu_1 - \nu_2)m_ec = \left(\frac{h}{c}\right)^2 \nu_1\nu_2(1 - \cos\theta), \tag{4.122c} \]

\[ \text{or} \quad c\left(\frac{1}{\nu_2} - \frac{1}{\nu_1}\right) = \frac{h}{m_ec}(1 - \cos\theta). \tag{4.122d} \]

From Eq. (4.122c), we get the change in the gamma ray energy:

\[ h\nu_1 - h\nu_2 = \frac{h\nu_1\, h\nu_2}{m_ec^2}(1 - \cos\theta). \tag{4.123} \]

From Eq. (4.122d), we get the change in wavelength, using the fact that
$\lambda\nu = c$:

\[ \Delta\lambda = \lambda_2 - \lambda_1 = \frac{h}{m_ec}(1 - \cos\theta), \tag{4.124} \]

where $\lambda_1$, $\lambda_2$ represent the wavelengths of the incident and the scattered X-rays.

The factor $h/(m_ec)$ appearing in the above equation is called the Compton
wavelength and is denoted by the symbol $\lambda_c$. Its value can be computed:

\[ \lambda_c = \frac{h}{m_ec} = \frac{6.63 \times 10^{-34}}{(9.11 \times 10^{-31}) \times (3 \times 10^{8})} = 2.43 \times 10^{-12}\ \text{m} = 0.0243\ \text{Å}. \tag{4.125} \]


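The kinetic energy $T_e$ transferred to the ejected electron is the same as
the change in the photon energy. To find this energy we first solve Eq.
(4.123) for $h\nu_2$:

\[ h\nu_2 = \frac{m_ec^2}{\dfrac{m_ec^2}{h\nu_1} + (1 - \cos\theta)} = \frac{h\nu_1}{1 + (1 - \cos\theta)\dfrac{h\nu_1}{m_ec^2}}. \tag{4.126a} \]

Therefore

\[ T_e = h\nu_1 - h\nu_2 = \frac{(1 - \cos\theta)\,h\nu_1}{\dfrac{m_ec^2}{h\nu_1} + (1 - \cos\theta)}. \tag{4.126b} \]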


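We shall now obtain a relation between the scattering angle θ and the
electron ejection angle φ, from the momentum conservation equation
(4.119):

\[ \mathbf{p}_1 = p_1\mathbf{i}, \quad \mathbf{p}_2 = p_2(\cos\theta\,\mathbf{i} - \sin\theta\,\mathbf{j}), \quad \mathbf{p}_e = p_e(\cos\phi\,\mathbf{i} + \sin\phi\,\mathbf{j}), \tag{4.127a} \]

\[ p_e\cos\phi = p_1 - p_2\cos\theta \quad \text{(conservation of x-momentum)}, \tag{4.127b} \]

\[ p_e\sin\phi = p_2\sin\theta \quad \text{(conservation of y-momentum)}. \tag{4.127c} \]

\[ \cot\phi = \frac{p_1/p_2 - \cos\theta}{\sin\theta} = \frac{h\nu_1/h\nu_2 - \cos\theta}{\sin\theta} = \left(1 + \frac{h\nu_1}{m_ec^2}\right)\left(\frac{1 - \cos\theta}{\sin\theta}\right). \tag{4.127d} \]

To obtain Eq. (4.127d) we used Eq. (4.126a), and made some algebraic
manipulations.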

We have plotted $T_e$ vs. $h\nu_1$ and φ vs. θ in Fig. 4.11. It is seen from Fig.
4.11(b) that the electron is deflected by the angle π/2 at θ = 0, whereas
common sense would suggest that for a direct hit, the target particle should
recoil in the same direction. It is now seen from Fig. 4.11(a), and also
evident from Eq. (4.126b), that the electron’s share of the incident γ-ray
energy goes to zero as θ → 0. For a zero energy particle, the deflection
angle can be anything. However, at back scattering of the γ-ray, i.e. at θ = π,
the deflection angle is zero, which is what is expected.
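The limiting behaviour described above can be checked with a short numerical sketch of Eqs. (4.124), (4.126a), (4.126b) and (4.127d). The function name `compton` and the choice of MeV and ångström units are assumptions made for this illustration only.

```python
import math

ME_C2_MEV = 0.511              # electron rest energy in MeV
COMPTON_WAVELENGTH_A = 0.0243  # h/(m_e c) in angstrom, Eq. (4.125)

def compton(h_nu1_mev, theta):
    """Return (scattered photon energy in MeV, electron kinetic energy in MeV,
    wavelength shift in angstrom, electron ejection angle phi in radians)."""
    k = h_nu1_mev / ME_C2_MEV                        # h*nu1 / (m_e c^2)
    one_minus_cos = 1.0 - math.cos(theta)
    h_nu2 = h_nu1_mev / (1.0 + k * one_minus_cos)    # Eq. (4.126a)
    t_e = h_nu1_mev - h_nu2                          # Eq. (4.126b)
    d_lambda = COMPTON_WAVELENGTH_A * one_minus_cos  # Eq. (4.124)
    # Eq. (4.127d): cot(phi) = (1 + k) * (1 - cos(theta)) / sin(theta).
    # atan2 avoids a division by zero near theta = 0 and theta = pi.
    phi = math.atan2(math.sin(theta), (1.0 + k) * one_minus_cos)
    return h_nu2, t_e, d_lambda, phi

for theta_deg in (1, 45, 90, 135, 180):
    h_nu2, t_e, d_lambda, phi = compton(1.0, math.radians(theta_deg))
    print(theta_deg, round(h_nu2, 4), round(t_e, 4),
          round(d_lambda, 5), round(math.degrees(phi), 2))
```

For small θ the ejection angle approaches 90° while the electron energy tends to zero, and at θ = 180° the ejection angle is zero, in line with the discussion of Fig. 4.11.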


Fig. 4.11. Compton scattering. (a) Electron energy vs. incident photon energy hν₁ (MeV) for four values of
photon scattering angle (indicated by the side of each graph). (b) Electron ejection angle vs. photon
scattering angle for four values of photon energy in MeV (indicated by the side of each graph).


4.10. Summary of Important Formulas 


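Lorentz factor for boost. See Eq. (2.20).

\[ \gamma = \frac{1}{\sqrt{1 - \beta^2}}, \quad \text{where } \beta = \frac{u}{c}. \]

Lorentz transformation from S to S'. See Eq. (3.8).

\[ ct' = \gamma(ct - \beta x), \qquad x' = \gamma(x - \beta ct), \]

\[ v'_x = \frac{v_x - u}{1 - \dfrac{u v_x}{c^2}}, \qquad v'_y = \frac{v_y}{\gamma\left(1 - \dfrac{u v_x}{c^2}\right)}, \qquad v'_z = \frac{v_z}{\gamma\left(1 - \dfrac{u v_x}{c^2}\right)}. \]

Other important formulas.

Dynamic Lorentz factor: $\Gamma = \dfrac{1}{\sqrt{1 - v^2/c^2}}$.

Momentum: $\mathbf{p} = \Gamma m_0\mathbf{v}$.

Relativistic mass: $m = \Gamma m_0 = \dfrac{m_0}{\sqrt{1 - v^2/c^2}}$.

Equation of motion: $\mathbf{F} = \dfrac{d\mathbf{p}}{dt}$.

Kinetic energy: $K = (\Gamma - 1)m_0c^2$.

Total energy: $E = K + m_0c^2 = \Gamma m_0c^2 = mc^2$.

Relation between p and E: $E^2 = c^2p^2 + m_0^2c^4$.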


4.11. Worked Out Problems II 


Problem 4.1. A particle P decays into two smaller particles $D_1$ and $D_2$.

\[ P \to D_1 + D_2 \]

(a famous example being $\Lambda \to p + \pi^-$; also see the example suggested in
Example 4.5). Let M be the mass of the parent particle and let $m_1$, $m_2$ be the
masses of the daughters (“mass” will mean “rest mass”, unless otherwise
stated). Energy consideration requires that $M > m_1 + m_2$, so that

\[ Q = M - m_1 - m_2 \tag{4.128} \]

is the mass excess, and $Qc^2$ is the energy liberated in the reaction in the
rest frame of P (or the disintegration energy).

(a) Let $p_1$ and $E_1$ represent the momentum and energy of $D_1$ in the rest
frame of P. Using energy and momentum conservation laws and Eq. (4.78),
show that

\[ p_1^2 = \left[\frac{(M^2 + m_1^2 - m_2^2)c}{2M}\right]^2 - m_1^2c^2, \tag{4.129} \]

\[ E_1 = \left[\frac{M}{2} + \frac{m_1^2 - m_2^2}{2M}\right]c^2. \tag{4.130} \]

(b) Note that $p_1 = p_2$. However, Eq. (4.129) does not exhibit explicit
symmetry between $m_1$ and $m_2$. Therefore, recast Eq. (4.129) into the
following symmetric form:

\[ p_1 = p_2 = \frac{\sqrt{\{M^2 - (m_1 + m_2)^2\}\{M^2 - (m_1 - m_2)^2\}}}{2M}\,c. \tag{4.131} \]

(c) Hence, show that

\[ Qc^2 = T_1 + T_2, \tag{4.132} \]

where $T_1$, $T_2$ are the kinetic energies of the daughters $D_1$, $D_2$, respectively,
in the rest frame of P.

(d) Apply the above equations to the observed decay of the pi-meson ($\pi^+$) to
mu-meson ($\mu^+$) and neutrino ($\nu_\mu$). Take the masses, in units of MeV/c², as
follows: $M_\pi = 139.6$; $m_\mu = 105.7$; $m_\nu = 0$. Calculate the kinetic energies of
$\mu^+$ and $\nu_\mu$ in the rest frame of $\pi^+$.


References: Jackson [13], Griffiths [14] 


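Solution. (a) Use energy-momentum conservation in the rest frame of P:

\[ E_1 + E_2 = Mc^2, \qquad \mathbf{p}_1 + \mathbf{p}_2 = 0. \tag{a} \]

\[ \text{Hence,} \quad E_1 = Mc^2 - E_2, \tag{b} \]

\[ \text{or} \quad \sqrt{p_1^2c^2 + m_1^2c^4} = Mc^2 - \sqrt{p_2^2c^2 + m_2^2c^4} = Mc^2 - \sqrt{p_1^2c^2 + m_2^2c^4}. \tag{c} \]

\[ \text{Squaring, simplifying,} \quad \sqrt{p_1^2c^2 + m_1^2c^4} = \frac{(M^2 + m_1^2 - m_2^2)c^2}{2M}, \tag{d} \]

\[ \text{or} \quad p_1^2 = \left[\frac{(M^2 + m_1^2 - m_2^2)c}{2M}\right]^2 - m_1^2c^2. \tag{e} \]

\[ \text{Also,} \quad E_1^2 = p_1^2c^2 + m_1^2c^4 = \left[\frac{(M^2 + m_1^2 - m_2^2)c^2}{2M}\right]^2 - m_1^2c^4 + m_1^2c^4 = \left[\frac{(M^2 + m_1^2 - m_2^2)c^2}{2M}\right]^2. \tag{f} \]

\[ \text{Hence,} \quad E_1 = \frac{M^2 + m_1^2 - m_2^2}{2M}\,c^2. \tag{g} \]

\[ \text{Similarly,} \quad E_2 = \frac{M^2 + m_2^2 - m_1^2}{2M}\,c^2. \tag{h} \]

(b) By algebraic manipulation, we rewrite (e) as follows:

\[ p_1^2 = \frac{c^2\{M^2 - (m_1 + m_2)^2\}\{M^2 - (m_1 - m_2)^2\}}{4M^2} = p_2^2, \]

showing symmetry between $m_1$ and $m_2$.

(c) $Q = M - m_1 - m_2$ = mass excess.

\[ T_1 = E_1 - m_1c^2 = \frac{c^2}{2M}(M^2 + m_1^2 - m_2^2) - m_1c^2, \]

\[ \text{or} \quad \frac{2M}{c^2}T_1 = M^2 + m_1^2 - m_2^2 - 2Mm_1 = (M - m_1)^2 - m_2^2 = (M - m_1 - m_2)(M - m_1 + m_2) = Q(2M - 2m_1 - Q), \]

\[ \text{or} \quad T_1 = \frac{Q(2M - 2m_1 - Q)}{2M}c^2. \]

\[ \text{Similarly,} \quad T_2 = \frac{Q(2M - 2m_2 - Q)}{2M}c^2. \]

\[ \text{Adding,} \quad T_1 + T_2 = \frac{Q\{2M + 2(M - m_1 - m_2 - Q)\}}{2M}c^2 = Qc^2. \]

(d) $M = 139.6$ MeV/c²; $m_1 = 105.7$ MeV/c²; $m_2 = 0$ MeV/c²;

\[ Q = 139.6 - 105.7 = 33.9\ \text{MeV}/c^2. \]

\[ T_1 = \frac{Q(2M - 2m_1 - Q)}{2M}c^2 = \frac{Q^2}{2M}c^2 = \frac{33.9^2}{2 \times 139.6} = 4.12\ \text{MeV}. \]

\[ T_2 = \frac{Q(2M - 2m_2 - Q)}{2M}c^2 = \frac{Q(2M - Q)}{2M}c^2 = \frac{33.9 \times (2 \times 139.6 - 33.9)}{2 \times 139.6} = 29.78\ \text{MeV}. \]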
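The numbers of part (d) can be reproduced with a few lines of code. The sketch below is only an illustration: the function name `two_body_decay` is hypothetical, and masses are taken in MeV/c² so that energies come out in MeV and momenta in MeV/c.

```python
import math

def two_body_decay(M, m1, m2):
    """Two-body decay in the rest frame of the parent.
    Masses in MeV/c^2. Returns (p, E1, E2, T1, T2), with p in MeV/c
    and the energies in MeV."""
    E1 = (M**2 + m1**2 - m2**2) / (2.0 * M)           # Eq. (4.130)
    E2 = (M**2 + m2**2 - m1**2) / (2.0 * M)           # same with 1 <-> 2
    p = math.sqrt((M**2 - (m1 + m2)**2) *
                  (M**2 - (m1 - m2)**2)) / (2.0 * M)  # Eq. (4.131)
    return p, E1, E2, E1 - m1, E2 - m2

# Pion decay into a muon and a (massless) neutrino, as in part (d).
p, E_mu, E_nu, T_mu, T_nu = two_body_decay(139.6, 105.7, 0.0)
print(f"p = {p:.2f} MeV/c, T_mu = {T_mu:.2f} MeV, T_nu = {T_nu:.2f} MeV")
```

The sum of the two kinetic energies equals the disintegration energy, and for the massless neutrino the momentum (times c) equals its energy.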
Problem 4.2. A particle is moving with velocity v in the XY-plane,
making an angle θ with the X-axis, as seen from the frame S.

(i) Find the magnitude of the velocity of this particle in the frame S'
under the boost: S(cβ, 0, 0)S’.

(ii) Show that the angle θ’ of this velocity with the X’-axis is given by the
angle transformation formula:

\[ \tan\theta' = \frac{\nu\sin\theta}{\gamma(\nu\cos\theta - \beta)}, \quad \text{where } \nu = v/c. \tag{4.133} \]

(iii) Specialize to the case of a photon, and show that under the above
boost, the magnitude of the velocity does not change. However, the
angle of inclination changes to

\[ \tan\theta' = \frac{\sin\theta}{\gamma(\cos\theta - \beta)}. \tag{4.134} \]

(iv) Modify the above equation to obtain the aberration angle φ' of
starlight which comes at an angle φ in the Sun’s frame (frame of the
“fixed stars”). See Fig. 4.12. Show that

\[ \tan\phi' = \frac{\gamma(\sin\phi + \beta)}{\cos\phi}. \tag{4.135} \]

Hint: Set θ = −(π/2 + φ).


Solution. The velocity of the particle in the frame S is $\mathbf{v} = v(\cos\theta\,\mathbf{e}_x + \sin\theta\,\mathbf{e}_y)$.
Let $\mathbf{v}' = v'(\cos\theta'\,\mathbf{e}_x + \sin\theta'\,\mathbf{e}_y)$ be its velocity in the frame S'.

Fig. 4.12. Aberration of starlight. (a) The angles φ, φ' made by the starlight in the Sun’s frame S
and in the frame S' moving with speed cβ with respect to the Sun, with θ = −(π/2 + φ). (b) Plot of φ' vs. φ
(both in units of π/6) for values of β = 0, 0.1, 0.2,..., 0.9.

Setting $v = c\nu$, $v' = c\nu'$, and using velocity addition formula (4.23),

\[ \nu'\cos\theta' = \frac{\nu\cos\theta - \beta}{1 - \beta\nu\cos\theta}, \tag{a} \]

\[ \nu'\sin\theta' = \frac{\nu\sin\theta}{\gamma(1 - \beta\nu\cos\theta)}. \tag{b} \]

\[ \nu'^2 = \left[\frac{\nu\cos\theta - \beta}{1 - \beta\nu\cos\theta}\right]^2 + \left[\frac{\nu\sin\theta}{\gamma(1 - \beta\nu\cos\theta)}\right]^2 \tag{c} \]

\[ \quad\ = 1 - \frac{(1 - \beta^2)(1 - \nu^2)}{(1 - \beta\nu\cos\theta)^2}. \tag{d} \]

\[ \tan\theta' = \frac{\nu\sin\theta}{\gamma(\nu\cos\theta - \beta)}. \tag{f} \]

Answers to parts (i) and (ii) are given in Eqs. (d) and (f). That to part (iii) is
obtained by setting ν = 1.

We have explained the light aberration problem in Fig. 4.12(a). In Fig.
4.12(b), we have plotted φ' vs. φ, making tick marks on the axes at intervals
of π/6.

Earth moves along its orbit with a speed of 30 km/s, which means β =
10⁻⁴. This makes φ' ≈ φ. In actual terms, φ' ≈ β = 10⁻⁴ radian, for φ =
0, which is approximately equal to 20 arcseconds, when the star is at the
zenith.
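The transformation of speed and direction, and the aberration formula (4.135), can be verified numerically. The following sketch uses hypothetical function names and works with speeds in units of c; it is an illustration of the formulas above, not a prescribed method.

```python
import math

def boosted_direction(nu, theta, beta):
    """Speed (in units of c) and direction angle in S' of a particle that
    moves in S with speed nu (units of c) at angle theta to the X-axis,
    for a boost of speed beta (units of c) along X."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    denom = 1.0 - beta * nu * math.cos(theta)
    vx = (nu * math.cos(theta) - beta) / denom    # Eq. (a) of the solution
    vy = nu * math.sin(theta) / (gamma * denom)   # Eq. (b) of the solution
    return math.hypot(vx, vy), math.atan2(vy, vx)

def aberration_angle(phi, beta):
    """Apparent angle phi' of starlight, Eq. (4.135)."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return math.atan2(gamma * (math.sin(phi) + beta), math.cos(phi))

# A photon keeps unit speed under the boost (part iii).
speed, theta_prime = boosted_direction(1.0, 0.7, 0.6)
print(speed)

# Earth's orbital speed, star at the zenith (phi = 0).
arcsec = math.degrees(aberration_angle(0.0, 1.0e-4)) * 3600.0
print(f"aberration = {arcsec:.1f} arcseconds")
```

Setting θ = −(π/2 + φ) in `boosted_direction` and reading off φ' = −θ' − π/2 gives the same angle as `aberration_angle`, which is the content of the hint in part (iv).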


Problem 4.3. Let one of the daughters of Problem 4.1, say $D_2$, be a particle
of zero rest mass (e.g. photon, neutrino). (One typical such reaction is the
transition of an atom or nucleus from an excited state to the ground level, or
any lower level, accompanied by the emission of a photon.) Show that for
this case

\[ p_1c = \frac{M^2 - m_1^2}{2M}c^2, \tag{4.136a} \]

\[ E_1 = \frac{M^2 + m_1^2}{2M}c^2. \tag{4.136b} \]

Solution. Use Eq. (4.130). Set $m_2 = 0$.


Problem 4.4. Let an atom (or a nucleus) make a radiative transition (i.e. the
process is accompanied by the emission of a photon) from an excited state
of energy ε (measured from the ground state) to the ground state. Let the
mass of the atom (or the nucleus) at its ground state be $M_0$. Using Eq.
(4.130), or starting from energy-momentum conservation equations, show
that the frequency ν of the emitted photon is given by the relation

\[ h\nu = \epsilon\left\{1 - \frac{\epsilon}{2(\epsilon + M_0c^2)}\right\}, \tag{4.137} \]

where h is Planck’s constant.

Solution. Using Eq. (4.130), setting $m_1 = 0$, $E_1 = h\nu$, $M = M_0 + \epsilon/c^2$, $m_2 = M_0$,
we get the answer.


Problem 4.5. An object of mass $m_1$ and moving with momentum $p_1$
collides and coalesces with a stationary object of mass $m_2$. Show that the
mass M of the compound object thus formed is given by the relation:

\[ M^2c^2 = m_1^2c^2 + m_2^2c^2 + 2m_2\sqrt{p_1^2c^2 + m_1^2c^4}. \tag{4.138} \]

Solution. Let $E_1$, $E_2$ represent the energies of the objects before collision.
Let P, E, M represent the momentum, energy and mass of the coalesced
object (after collision). Using Eq. (4.83), we get

\[ E_1 = \sqrt{m_1^2c^4 + p_1^2c^2}; \qquad E_2 = m_2c^2. \]

Energy conservation: $E = E_1 + E_2 = \sqrt{m_1^2c^4 + p_1^2c^2} + m_2c^2$.

Momentum conservation: $P = p_1$.

Using Eq. (4.83) again:

\[ M^2c^2 = E^2/c^2 - P^2 = m_1^2c^2 + m_2^2c^2 + 2m_2\sqrt{p_1^2c^2 + m_1^2c^4}. \]


Problem 4.6. In a {p + ⁷Li → ⁸Be* → α + α} reaction, protons of energy
3.00 MeV fuse with a stationary ⁷Li nucleus to form a compound nucleus
⁸Be*, which is the nucleus of ⁸Be in an excited state. The compound
nucleus has a short lifetime ~10⁻¹⁶ s. It breaks up into two α particles,
moving in opposite directions.°


(a) Determine the mass, momentum and recoil velocity of the compound
nucleus in the Lab frame. Is this velocity relativistic?

(b) What is the energy level of the compound nucleus (i.e. the energy of
the nucleus, in MeV, above the ground level of ⁸Be)?

(c) Find the kinetic energy, momentum and velocity of each α particle in
the rest frame of the compound nucleus.


Solution. 


Answers to Part (a). We have explained the nuclear reaction in Fig. 4.13.
In the following, we shall use the symbol “u” to mean atomic mass unit.

Fig. 4.13. Energy conservation in the nuclear reaction {p + ⁷Li → ⁸Be* → α + α}. (a) The
reaction {p + ⁷Li → ⁸Be*}; (b) the reaction {⁸Be* → α + α}.

We shall use Eq. (4.138) in which particle #1 will represent the proton, and
particle #2 the lithium nucleus ⁷Li. The energy of the incident particle,
i.e. the proton, is non-relativistic, being only 3 MeV compared to the rest
energy of the proton which is ≈ 1 u·c² = 931.48 MeV.

We shall write (m, p) for the mass and the momentum of the proton, M
for the mass of the target ⁷Li nucleus, $\mathcal{M}^*$ for the mass of the compound
nucleus ⁸Be* in the resulting excited state, $\mathcal{M}$ for the mass of the same
nucleus in its ground state, and set

\[ \delta M = \mathcal{M}^* - (M + m). \tag{4.139} \]


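Equation (4.138) now becomes

\[ \mathcal{M}^{*2} = m^2 + M^2 + 2M\sqrt{\left(\frac{p}{c}\right)^2 + m^2} \tag{4.140a} \]

\[ \qquad = (m + M)^2 - 2Mm + 2M\sqrt{\left(\frac{p}{c}\right)^2 + m^2}, \tag{4.140b} \]

\[ \text{or} \quad [\mathcal{M}^* + (m + M)][\mathcal{M}^* - (m + M)] + 2mM = 2M\sqrt{\left(\frac{p}{c}\right)^2 + m^2}, \tag{4.140c} \]

\[ \text{or,} \quad [2(M + m) + \delta M]\,\delta M + 2mM = 2M\sqrt{\left(\frac{p}{c}\right)^2 + m^2}. \tag{4.140d} \]

Let K represent the K.E. of the proton. Then, in this N.R. case

\[ K = \frac{p^2}{2m}, \quad \text{so that} \quad \left(\frac{p}{mc}\right)^2 = \frac{2K}{mc^2} \ll 1. \tag{4.141} \]

Hence,

\[ \sqrt{\left(\frac{p}{c}\right)^2 + m^2} = m\sqrt{1 + \left(\frac{p}{mc}\right)^2} \approx m\left(1 + \frac{1}{2}\left(\frac{p}{mc}\right)^2\right). \tag{4.142} \]

Therefore, from (4.140d)

\[ [2(M + m) + \delta M]\,\delta M + 2mM = 2Mm\left(1 + \frac{1}{2}\left(\frac{p}{mc}\right)^2\right), \tag{4.143a} \]

\[ \text{or,} \quad 2(M + m)\,\delta M + 2mM \approx 2Mm\left(1 + \frac{1}{2}\left(\frac{p}{mc}\right)^2\right), \tag{4.143b} \]

\[ \text{or,} \quad 2(M + m)\,\delta M = Mm\left(\frac{p}{mc}\right)^2, \tag{4.143c} \]

\[ \text{or,} \quad \delta M = \frac{Mm}{2(M + m)}\left(\frac{p}{mc}\right)^2 = \frac{Mm}{2(M + m)}\left(\frac{2K}{mc^2}\right) = \frac{Mm}{M + m}\left(\frac{K}{mc^2}\right). \tag{4.143d} \]

Now we collect some mass values from [16].

\[ \text{u} = 1\ \text{atomic mass unit} = 931.481\ \text{MeV}/c^2, \tag{4.144a} \]

\[ 1\,\text{u} \times c^2 = 931.481\ \text{MeV}, \tag{4.144b} \]

\[ 1\ \text{MeV} = \frac{c^2}{931.481}\ \text{u} = 1.07356 \times 10^{-3} \times c^2\ \text{u} \tag{4.144c} \]

\[ \qquad\quad = 9.662 \times 10^{13}\ \text{u·(m/s)}^2, \tag{4.144d} \]

\[ m = m(\text{proton}) = 1.00782\ \text{u}, \tag{4.144e} \]

\[ M = m(^7\text{Li}) = 7.016004\ \text{u}, \tag{4.144f} \]

\[ \mathcal{M} = m(^8\text{Be}) = 8.005305\ \text{u}, \tag{4.144g} \]

\[ \mu = m(\alpha) = 4.0026\ \text{u}. \tag{4.144h} \]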

Go back to (4.143d), taking K = 3 MeV as given:

\[ \frac{K}{mc^2} = \frac{3}{1.00782 \times 931.481} = 3.1957 \times 10^{-3}, \]

\[ \frac{Mm}{M + m} = \frac{7.016004 \times 1.00782}{7.016004 + 1.00782} = \frac{7.0709}{8.02382} = 0.88124\ \text{u}, \tag{4.145} \]

\[ \delta M = 0.88124 \times 3.1957 \times 10^{-3} = 2.8162 \times 10^{-3}\ \text{u}. \]

Therefore, from (4.139), mass of the compound nucleus ⁸Be*:

\[ \mathcal{M}^* = (M + m) + \delta M = 8.02382 + 2.8162 \times 10^{-3} = 8.02664\ \text{u}. \tag{4.146} \]

Let us write $(E, K, P, V)$ for the (total energy, kinetic energy, momentum,
recoil velocity) of ⁸Be*, following the reaction. Since the energy of this
nuclide is in the N.R. domain, we can write $E = K + \mathcal{M}^*c^2$. See (4.80) and the
comment following (4.83). Due to conservation of energy:

\[ K(p) + (M + m)c^2 = K + \mathcal{M}^*c^2, \]

\[ \text{or} \quad K = K(p) + \{(M + m) - \mathcal{M}^*\}c^2 = K(p) - \delta M\,c^2 \]

\[ \qquad = 3 - 2.8162 \times 10^{-3} \times 931.481 = 3 - 2.623 = 0.377\ \text{MeV} \]

\[ \qquad = 0.377 \times 9.662 \times 10^{13}\ \text{u·(m/s)}^2 = 3.6426 \times 10^{13}\ \text{u·(m/s)}^2 \quad [\text{used (4.144d)}], \tag{4.147} \]

\[ P = \sqrt{2\mathcal{M}^*K} = \sqrt{2 \times 8.02664 \times 3.6426 \times 10^{13}} = \sqrt{5.8475} \times 10^{7} = 2.418 \times 10^{7}\ \text{u·m/s}, \]

\[ V = \frac{P}{\mathcal{M}^*} = \frac{2.418}{8.02664} \times 10^{7} \approx 0.3 \times 10^{7}\ \text{m/s} = \frac{c}{100}. \]

The domain of relativistic dynamics starts at ~ c/4. See Example 1 in
Sec. 9.7. Hence the recoil velocity V is non-relativistic.


Answer to Part (b). We shall find the difference between the mass $\mathcal{M}^*$ of
⁸Be* and the mass $\mathcal{M}$ of ⁸Be.

\[ \delta M = \mathcal{M}^* - (M + m) = 2.8162 \times 10^{-3}\ \text{u}, \]

\[ \Delta M = (M + m) - \mathcal{M} = (7.016004 + 1.00782) - 8.005305 = 18.519 \times 10^{-3}\ \text{u}, \]

\[ (\mathcal{M}^* - \mathcal{M}) = (\delta M + \Delta M) = (2.8162 + 18.519) \times 10^{-3} = 21.3352 \times 10^{-3}\ \text{u}, \]

\[ (\mathcal{M}^* - \mathcal{M})c^2 = 21.3352 \times 10^{-3} \times 931.481 = 19.87\ \text{MeV}, \tag{4.148} \]

which is the energy level of the compound nucleus ⁸Be* above the ground
state of ⁸Be. This result concurs with the value of 19.9 MeV found in [15, p.
203].


Answer to Part (c). Let us compare the mass of two α-particles, equal to 2μ,
with the mass of ⁸Be, equal to $\mathcal{M}$.

\[ 2\mu - \mathcal{M} = 2 \times 4.0026 - 8.0053 = -0.0001\ \text{u} = -0.0931\ \text{MeV}/c^2. \]

Compare this with the energy level (−0.096 MeV) for 2α shown in [15,
p. 203]. The small discrepancy can be due to error margins of the mass
values taken from [16].

Let us follow up by energy conservation of the decaying nucleus ⁸Be*
in its rest frame. Let k(α) stand for the kinetic energy of each α particle
(they are equal, as they move apart in opposite directions with equal and
opposite momenta).

\[ \mathcal{M}^*c^2 = 2(\mu c^2 + k(\alpha)), \]

\[ \text{or,} \quad 8.0266 = 2 \times (4.0026 + k(\alpha)/c^2), \quad \text{from (4.146) and (4.144h)}, \]

\[ \text{or,} \quad k(\alpha)/c^2 = \tfrac{1}{2} \times (8.0266 - 8.0052) = 0.0214/2 = 0.0107\ \text{u}, \tag{4.149} \]

\[ \text{or,} \quad k(\alpha) = 0.0107 \times 9 \times 10^{16} = 9.63 \times 10^{14}\ \text{u·(m/s)}^2 = \frac{9.63 \times 10^{14}}{9.662 \times 10^{13}} = 9.97\ \text{MeV} \quad [\text{we used (4.144d)}]. \]

Let us compare k(α) with μc². From (4.144b) and (4.144h), μc² =
4.0026 u·c² = 4.0026 × 931.481 = 3728 MeV. Thus, k(α) ≪ μc², and the N.R.
approximation is valid. See the comment following (4.147). Therefore, we
can write

\[ p(\alpha) = \sqrt{2\mu k(\alpha)} = \sqrt{2 \times 4.0026 \times 9.63 \times 10^{14}} = 8.78 \times 10^{7}\ \text{u·m/s}, \]

\[ v(\alpha) = \frac{p(\alpha)}{\mu} = \frac{8.78 \times 10^{7}\ \text{u·m/s}}{4.0026\ \text{u}} = 2.19 \times 10^{7}\ \text{m/s} \approx 0.073c. \tag{4.150} \]

The α particles emerging from the decay of ⁸Be* are far below the
relativistic domain.
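As a cross-check of the non-relativistic approximation used in Eq. (4.143d), the mass of the compound nucleus can also be computed from Eq. (4.140a) without approximation, using the fact that the proton's total energy is mc² + K. The sketch below assumes the mass values of Eq. (4.144); all variable names are hypothetical.

```python
import math

U_MEV = 931.481                  # 1 u in MeV/c^2, Eq. (4.144a)
m_p, M_li = 1.00782, 7.016004    # proton and 7Li masses in u, Eq. (4.144)
M_be, m_alpha = 8.005305, 4.0026 # 8Be (ground state) and alpha masses in u
K_MEV = 3.0                      # kinetic energy of the incident proton

K_u = K_MEV / U_MEV              # K/c^2 expressed in u

# Eq. (4.140a) with sqrt((p/c)^2 + m^2) = m + K/c^2 (total proton energy / c^2).
m_star = math.sqrt(m_p**2 + M_li**2 + 2.0 * M_li * (m_p + K_u))

delta_M_exact = m_star - (M_li + m_p)                   # Eq. (4.139)
delta_M_nr = (M_li * m_p / (M_li + m_p)) * (K_u / m_p)  # Eq. (4.143d)

excitation_mev = (m_star - M_be) * U_MEV                # as in Eq. (4.148)
k_alpha_mev = 0.5 * (m_star - 2.0 * m_alpha) * U_MEV    # as in Eq. (4.149)

print(f"delta M: exact {delta_M_exact:.6e} u, N.R. {delta_M_nr:.6e} u")
print(f"excitation energy {excitation_mev:.2f} MeV, "
      f"k(alpha) {k_alpha_mev:.2f} MeV")
```

The two values of δM agree to better than 10⁻⁶ u, which supports dropping the (δM)² term in going from (4.143a) to (4.143b).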


Problem 4.7. In this exercise, we shall compute the velocity of the zero
momentum frame (Sec. 4.7) using three examples.

(a) Consider the first part of the reaction given in Problem 4.6, i.e. {p + ⁷Li
→ ⁸Be*}. Find the momentum and the velocity of p impinging on the
⁷Li nucleus with kinetic energy 3 MeV.

(b) Find the velocity of the CM before and after the reaction, in the Lab
frame.

(c) Consider the second part of the reaction given in Problem 4.6, i.e.
{⁸Be* → α + α}. Find the velocity of the CM before and after the
reaction, in the ZM frame.

Solution.

Answer to Part (a). In the following equation, the subscript “p” represents
proton.

\[ p_p = \sqrt{2m_pK_p} = \sqrt{2 \times 1.00782 \times 3 \times 9.662 \times 10^{13}} = 2.417 \times 10^{7}\ \text{u·m/s}, \]

\[ v_p = \frac{p_p}{m_p} = \frac{2.417}{1.00782} \times 10^{7} = 2.398 \times 10^{7}\ \text{m/s} \approx 0.08c. \tag{4.151} \]

Answer to Part (b). Let us first find the relativistic masses of p, ⁷Li, ⁸Be* in
the atomic mass unit u.

\[ \Gamma_p = \frac{1}{\sqrt{1 - (v_p/c)^2}} = \frac{1}{\sqrt{1 - (0.08)^2}} \approx 1.003; \qquad \Gamma(^7\text{Li}) = 1, \quad \Gamma(^8\text{Be}^*) \approx 1. \]

\[ \therefore\ m_p = 1.003 \times 1.00782 = 1.0108; \quad m(^7\text{Li}) = M = 7.016004; \quad m(^8\text{Be}^*) = \mathcal{M}^* = 8.02664. \tag{4.152} \]

\[ V_{cm,before} = \frac{p_p + p_{Li}}{m_p + m_{Li}} = \frac{2.417 \times 10^{7} + 0}{1.0108 + 7.016004} = 0.3 \times 10^{7}\ \text{m/s} \approx \frac{c}{100}. \]

\[ V_{cm,after} = \frac{P}{\mathcal{M}^*} = \frac{c}{100}. \quad \text{See Eq. (4.147).} \]

Answer to Part (c). This is trivial. The momentum of the parent nucleus
is zero before the decay. The momenta of the daughter α-particles are equal
and opposite. Hence P = 0 before and after the decay process. Hence, $V_{cm}$ is
zero always.


4.12. Illustrative Numerical Examples II 


Example 4.1. We observe two galaxies A and B moving in opposite
directions with speeds 0.5c and 0.4c, respectively. What is the velocity of B
as seen from galaxy A?

Solution. The specified velocities $v_A = -0.5c$, $v_B = 0.4c$ are with respect to
“our” frame S. Let S' be the frame of the galaxy A. Then the required
velocity $v'_B$ of B is obtained by applying the velocity addition formula (4.6).

\[ v'_B = \frac{0.4 - (-0.5)}{1 - (0.4)(-0.5)}\,c = 0.75c. \]


Example 4.2. In the Lab frame a π⁺ is moving in the positive X-direction
with velocity 0.8c and a π⁻ is moving in the negative X-direction with
velocity 0.9c. Find (a) the velocity of π⁻ in the rest frame of π⁺, (b) the
velocity of π⁺ in the rest frame of π⁻.

Solution. This is similar to the previous problem. The given velocities $v_{\pi^+}$ =
0.8c, $v_{\pi^-}$ = −0.9c are with respect to “our” frame S. Let us represent the
rest frames of π⁺ and π⁻ as S' and S”, respectively. We get the answers as
follows.

\[ \text{(a)} \quad v'_{\pi^-} = \frac{-0.9 - 0.8}{1 - (-0.9)(0.8)}\,c = -0.988c. \]

\[ \text{(b)} \quad v''_{\pi^+} = \frac{0.8 - (-0.9)}{1 - (0.8)(-0.9)}\,c = 0.988c. \]
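Examples 4.1 and 4.2 both apply the one-dimensional velocity addition formula. A minimal sketch, with velocities in units of c and a hypothetical helper named `velocity_in_frame`, is:

```python
def velocity_in_frame(v, u):
    """Velocity (in units of c) of an object moving with velocity v in frame S,
    as seen from a frame that moves with velocity u in S. Both velocities are
    along the X-axis and are given in units of c."""
    return (v - u) / (1.0 - v * u)

# Example 4.1: galaxy B (+0.4c) as seen from galaxy A (-0.5c).
v_B_from_A = velocity_in_frame(0.4, -0.5)

# Example 4.2: each pion as seen from the rest frame of the other.
v_pim_from_pip = velocity_in_frame(-0.9, 0.8)
v_pip_from_pim = velocity_in_frame(0.8, -0.9)

print(v_B_from_A, v_pim_from_pip, v_pip_from_pim)
```

The two results of Example 4.2 are equal and opposite, as they must be, and neither exceeds c in magnitude.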


Example 4.3. A rod of proper length $L_0$ = 2 m is moving in the +X-direction
with velocity 0.6c. An electron is moving in the −X-direction with
velocity 0.8c. Find the time T that the electron takes to traverse the length
of the rod, as seen from the Lab frame S. Answer the question in the
following three ways and see that the answers agree.

(a) View the motion from the frame S.

(b) View the motion from the electron’s rest frame E, and convert the time
thus obtained into Lab time T. (Hint: use velocity addition formula and
Rules 2 and 3.)

(c) View the motion from the rest frame of the rod and convert
back to Lab time T. (Hint: Use velocity addition formula and LT.)


Solution. (a) Let $V_{ap}$ be the velocity with which the rod and the electron are
approaching each other, as seen from the Lab frame S, and let L be the
length of the rod in frame S which is related to the proper length $L_0$ by the
length contraction formula (2.26). We need to calculate the Lorentz factor γ
for transformation from the Lab frame to the rest frame of the rod.

\[ V_{ap} = 0.6c + 0.8c = 1.4c, \]

\[ \gamma = \frac{1}{\sqrt{1 - 0.6^2}} = \frac{1}{0.8} = \frac{5}{4}, \]

\[ L = \frac{L_0}{\gamma} = 2 \times 0.8 = 1.6\ \text{m}, \]

\[ T = \frac{L}{V_{ap}} = \frac{1.6}{1.4c} = \frac{1.6}{1.4 \times 3 \times 10^{8}} = 0.38 \times 10^{-8}\ \text{s}. \]

(b) We shall first find the time τ of traversing the length of the rod in the
electron’s rest frame E. For this we need to know the velocity cβ' and the
length L' of the rod in the frame E. We apply the velocity addition formula to
get the first answer, and the length contraction formula (2.26) to get the
second answer. For this second answer, we need the Lorentz factor γ' for
transformation from the frame E to the rest frame of the rod:

\[ \beta' = \frac{0.6 - (-0.8)}{1 - (0.6)(-0.8)} = 0.946, \qquad \gamma' = \frac{1}{\sqrt{1 - \beta'^2}} = 3.08, \]

\[ L' = \frac{L_0}{\gamma'} = \frac{2}{3.08} = 0.649\ \text{m}, \]

\[ \tau = \frac{L'}{\beta'c} = \frac{0.649}{0.946 \times 3 \times 10^{8}} = 0.23 \times 10^{-8}\ \text{s}. \]

We shall now convert this proper time into the Lab time T using the time
dilation formula (2.23). For this, we need the Lorentz factor $\gamma_E$ for
transformation between the Lab frame S and the electron frame E:

\[ \gamma_E = \frac{1}{\sqrt{1 - 0.8^2}} = \frac{1}{0.6} = \frac{5}{3}, \]

\[ T = \gamma_E\,\tau = \frac{5}{3} \times 0.23 \times 10^{-8} = 0.38 \times 10^{-8}\ \text{s}. \]

(c) This is left as an exercise for the reader.
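Readers who wish to check their work on all three routes may find the following sketch useful. It uses c = 3 × 10⁸ m/s as in the text; the variable names and the particular way of doing part (c), via a Lorentz transformation of the interval between the two end-passing events, are assumptions of this illustration and only one of several possible routes.

```python
import math

C = 3.0e8  # speed of light in m/s, as used in the text

def gamma(beta):
    """Lorentz factor for a speed beta (in units of c)."""
    return 1.0 / math.sqrt(1.0 - beta**2)

L0, beta_rod, beta_e = 2.0, 0.6, -0.8   # proper length (m), velocities / c

# (a) Lab frame: contracted rod and closing speed 1.4c.
T_lab = (L0 / gamma(beta_rod)) / ((beta_rod - beta_e) * C)

# (b) Electron frame: rod moves with the relative speed, then time dilation.
beta_rel = (beta_rod - beta_e) / (1.0 - beta_rod * beta_e)
tau = (L0 / gamma(beta_rel)) / (beta_rel * C)
T_from_electron = gamma(beta_e) * tau

# (c) Rod frame: the electron covers the proper length L0 moving in -X,
# so the two events are separated by dt' = L0/(beta_rel*c), dx' = -L0.
dt_rod = L0 / (beta_rel * C)
dx_rod = -L0
# Lorentz transformation back to the Lab frame (rod frame moves at +beta_rod).
T_from_rod = gamma(beta_rod) * (dt_rod + beta_rod * dx_rod / C)

print(T_lab, T_from_electron, T_from_rod)
```

All three values coincide, which is the agreement the example asks the reader to verify.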


Example 4.4. A proton is moving in the X-direction with velocity 0.999c.
The rest mass of a proton is $m_p$ = 1.0078 u, 1 u = 1.66 × 10⁻²⁷ kg. Find (a)
the energy equivalent of 1 u, denoted as ε, (b) the energy equivalent of the
rest mass of the proton, to be denoted as $E_0$, (c) the Lorentz factor Γ for the
moving proton, (d) the momentum p of the moving proton, (e) the energy E
of the moving proton. Express energy in MeV, and momentum in MeV/c.

Use the conversion factor 1 J = 6.242 × 10¹² MeV.

Solution. Let us denote 1 u as μ. Then $m_p$ = 1.0078μ.

\[ \text{(a)} \quad \epsilon = \mu c^2 = 1.66 \times 10^{-27} \times (3 \times 10^{8})^2 = 14.94 \times 10^{-11}\ \text{J} = 14.94 \times 10^{-11} \times 6.242 \times 10^{12}\ \text{MeV} = 932.56\ \text{MeV}, \]

\[ \text{(b)} \quad E_0 = m_pc^2 = 1.0078\,\epsilon = 939.83\ \text{MeV}, \]

\[ \text{(c)} \quad \Gamma = \frac{1}{\sqrt{1 - 0.999^2}} = 22.366, \]

\[ \text{(d)} \quad p = \Gamma m_pv = \Gamma m_pc\beta = \Gamma m_pc^2 \times \frac{\beta}{c} = 22.366 \times 939.83 \times 0.999\ \text{MeV}/c = 20999\ \text{MeV}/c, \]

\[ \text{(e)} \quad E = \Gamma m_pc^2 = 22.366 \times 939.83 = 21020\ \text{MeV}. \]
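The arithmetic of this example is easy to reproduce. The sketch below uses the rounded constants given in the problem statement (so the results inherit their rounding); the variable names are hypothetical.

```python
import math

U_KG = 1.66e-27      # 1 u in kg, as given in the example
C = 3.0e8            # speed of light in m/s, as used in the example
J_TO_MEV = 6.242e12  # conversion factor 1 J -> MeV, as given

eps = U_KG * C**2 * J_TO_MEV            # (a) energy equivalent of 1 u, MeV
E0 = 1.0078 * eps                       # (b) proton rest energy, MeV
beta = 0.999
Gamma = 1.0 / math.sqrt(1.0 - beta**2)  # (c) Lorentz factor
pc = Gamma * E0 * beta                  # (d) momentum times c, MeV
E = Gamma * E0                          # (e) total energy, MeV

print(f"eps = {eps:.2f} MeV, E0 = {E0:.2f} MeV, Gamma = {Gamma:.3f}")
print(f"pc = {pc:.0f} MeV, E = {E:.0f} MeV")

# Consistency check with the relation E^2 = (pc)^2 + E0^2.
print(math.isclose(E**2 - pc**2, E0**2, rel_tol=1e-9))
```

The last line confirms that the computed E and p satisfy the energy-momentum relation listed in Sec. 4.10.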


4.13. Exercises for the Reader I 


Listed below are values of some physical constants which the reader may
need in working out some of the exercises. (The mass values are taken from
[16].)

c = 3 × 10⁵ km/sec., h = Planck’s constant = 6.63 × 10⁻³⁴ J.sec.

Masses of e (i.e. electron), p (i.e. proton), n (i.e. neutron), α (i.e. alpha-
particle), ⁷Li nucleus are to be taken as $m_e$ = 5.49 × 10⁻⁴ u = 0.511 MeV/c²,
$m_p$ = 1.0078 u, $m_n$ = 1.0087 u, $m_\alpha$ = 4.0026 u; m(⁷Li) = 7.0160 u; where “u”
means “atomic mass unit (¹²C scale)”. 1 u = 931.48 MeV/c². 1 eV = 1.6021 ×
10⁻¹⁹ J.


R1 We observe two galaxies A and B moving in opposite directions with 
speeds 0.5c and 0.4c, respectively. What is the velocity of B as seen from 
galaxy A? 


R2 In the Lab frame a π⁺ is moving in the positive X-direction with velocity
0.8c and a π⁻ is moving in the negative X-direction with velocity 0.9c. Find
(a) the velocity of π⁻ in the rest frame of π⁺, (b) the velocity of π⁺ in the rest
frame of π⁻.


R3 A meter-stick (proper length 1 m) is moving in the +X-direction with 
velocity 0.6c. An electron is moving in the —X-direction with velocity 0.8c. 
How long does the electron take to pass the meter stick, as measured in the 
Lab frame L? Answer the question in the following three ways and see that 
the answers agree. 


(a) View the motion from the frame L. 

(b) View the motion from the electron’s rest frame E, and convert the time 
thus obtained into Lab time. (Hint: use velocity addition formula and 
Rules 2 and 3.) 

(c) View the motion from the rest frame of the meter-stick and convert 
back to Lab time. (Hint: Use velocity addition formula and LT.) 


R4 Two straight rods R₁ and R₂ of proper lengths L₁ = 15 m and L₂ = 20 m
are moving in opposite directions with velocities β₁ = 3/5 and β₂ = 4/5
respectively as seen from the Lab frame. Find the time required by the rods
to traverse the length of each other. Answer the question in three different
ways and see that you get the same answer.


(a) View the motion from the Lab frame L. 
(b) View the motion from the rest frame of R, and convert the time thus 


obtained to Lab time. 
(c) View the motion from the rest frame of R» and convert the time thus 


obtained to Lab time. 


R5 How much weight is gained or lost when 1 tonne of ice melts? Take the latent
heat of fusion of water as 3.34 × 10⁵ J.kg⁻¹.


R6 A proton is moving in the X-direction with velocity 0.999c in the Lab
frame. (a) Find the energy and momentum of the particle in the Lab frame.
(b) Using energy-momentum transformation equations (4.83) determine the
energy and momentum of the particle in the frame S’ under the
boost: Lab(0.990c, 0, 0)S’.


R7 Answer the same questions, asked in Problem 4.6, for an electron of the 
same energy. 


R8 Answer the same questions, asked in Problem 4.6, for an a particle of 
the same energy. 


R9 Find the momenta and velocities of the following particles. (a) 1 MeV
electron, (b) 1 MeV proton, (c) 1 MeV α particle, (d) 1 GeV electron, (e) 1
GeV proton, (f) 1 GeV α particle.


R10 A hydrogen atom decays from the level n = 2 to n = 1. The energy
levels of the hydrogen atom are given by the formula $E_n = -13.6/n^2$ eV. The rest
masses of a proton and an electron are $m_p$ = 938.3 MeV/c² and $m_e$ = 0.511
MeV/c². Using formulas (4.136) and (4.137) determine


(a) the momentum of the recoiling atom, 
(b) the energy of the emitted photon. 


R11 A $^{7}$Li nucleus consists of 7 nucleons of which three are protons and four 
are neutrons. Determine the energy in MeV required to dissociate this 
nucleus into its seven constituent nucleons. This energy is called the 
binding energy of the nucleus. 


Answers to Selected Exercises 


R1 0.75c. R2 −0.998c. R3 $0.19 \times 10^{-8}$ s. R4 120/(7c). R5 $0.37 \times 10^{-5}$ gm. 


R6(a) E = 20,991 MeV; pc = 20,970 MeV. R6(b) E′ = 1631 MeV; p′c = 
1335 MeV. 


a A good description of the fission and the fusion processes has been given by Samuel 
Gladstone [12]. The mass values have been taken from the same book. 

b Somnath Datta [8, Chapter 12]. The CM has been defined on p. 470, the rigid body motion, 
especially the precession of a spinning top with diagrams, on pp. 513-521, and the two-body system 
with several applications on pp. 521-533. 

c Rindler [4, Example 6.5, p. 126]. 

d See [13]. 

e See [15, p. 203]. 


Part II 


Amazing Power of Tensors 


Chapter 5 


Let Us Know Tensors 


We shall now prepare to launch a vehicle that will take us from 
our mundane three-dimensional physical space, namely the Euclidean 
space, denoted by the symbol $E^3$, which is encompassed by the three 
Cartesian axes X, Y, Z, or better still, spanned by the three unit space 
vectors i, j, k, to a more esoteric and adventurous world of four dimensions 
with the addition of just one more axis, the time axis cT. This new 
four-dimensional world will be called Minkowski space-time, and denoted by the 
symbol $M^4$. At the heart of this mathematical construct is Minkowski’s 
assertion (which we are restating in a modern style) that any phenomenon 
in physics must be expressible in the form of a tensor equation in which 
both sides are tensors of the same rank, with the same sequence of 
contravariant and covariant indices. Such a statement, without proper 
clarification, will scare the reader, as it has scared umpteen ordinary 
persons who have been fed with stories and myths of a Relativity demon 
living on a lofty mountain, beyond the range of ordinary vision. 

Our objective is to demystify this demon. Nothing that Einstein or 
Minkowski said can be above the level of a student who has studied 
calculus and the basic principles of electromagnetism. In fact, the basic 
impulse behind Einstein’s construction of Special Relativity, his puzzle and 
how he overcame it, were worked out in a series of exercises, all of which 
can be a standard set of homework problems in a serious course in 
Electromagnetism at the undergraduate level. 

We have presented an exposition of the original papers of Einstein and 
Minkowski in two articles which can be downloaded from the website of this 
author (see Preface). 


In this part of the book, we shall make a special effort to explain what 
a tensor is [17]. We encounter this strange object as the stress tensor in 
engineering mechanics (sometimes without being aware of it). We shall find 
out how the same object is reborn in $M^4$ as a 4-tensor. It is our hope 
that the step-by-step approach we have undertaken in the following sections 
will equip the reader with the necessary gear for climbing the Relativity 
mountain. 


5.1. Introduction to Tensor 


5.1.1. Vector—tensor analogy 


It may help the reader get an intuitive impression of a tensor if we point 
out what is common between a vector and a tensor. In fact, going by the 
gradation of tensors, a scalar quantity, like the electric charge of a particle 
(e.g., an electron), is a tensor of rank 0. A vector quantity, like the 
momentum of a particle, is a tensor of rank 1. The inertia tensor of a rigid 
body, from which its moment of inertia about any axis can be obtained, is a 
tensor of rank 2. 

In this chapter, we shall use the term tensor to mean a tensor of rank 2, 
even though one can build tensors of arbitrary rank n > 2, in which we are 
not interested here. 

In Fig. 5.1(b) we have drawn a vector V, by which we mean a directed 
straight line segment of a measured length. We have chosen a set of 
Cartesian axes X, Y, Z, and projected the vector on these axes by dropping 
perpendiculars on them, obtaining the intercepts $V_x, V_y, V_z$. We call 
these intercepts the scalar components of V associated with the respective 
axes, whose directions are represented by the unit vectors (also called the base 
vectors) $\mathbf{e}_x, \mathbf{e}_y, \mathbf{e}_z$. We can then write V either as a column matrix, or as a 
linear superposition of the base vectors with $(V_x, V_y, V_z)$ as the coefficients: 

$$\mathbf{V} = \begin{pmatrix} V_x \\ V_y \\ V_z \end{pmatrix} = V_x\mathbf{e}_x + V_y\mathbf{e}_y + V_z\mathbf{e}_z. \tag{5.1}$$


We can project V on any arbitrary direction represented by the unit 
vector n and get the intercept $V_n$, and call it the scalar component of V in 
the direction of n. Mathematically, we obtain $V_n$ by taking the dot product of 
V with n. Let $\mathbf{n} = n_x\mathbf{e}_x + n_y\mathbf{e}_y + n_z\mathbf{e}_z$. Then 


Fig. 5.1. Scalar, vector and tensor as geometrical objects. 


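$$V_n = \mathbf{V}\cdot\mathbf{n} = \begin{pmatrix} V_x & V_y & V_z \end{pmatrix}\begin{pmatrix} n_x \\ n_y \\ n_z \end{pmatrix} = V_x n_x + V_y n_y + V_z n_z. \tag{5.2}$$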


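By analogy, we can think of a mathematical object $\hat{T}$ having vector 
components $\mathbf{T}_x, \mathbf{T}_y, \mathbf{T}_z$ associated with the axes X, Y, Z, as shown in Fig. 
5.1(c), and call it a tensor. We can then write 

$$\hat{T} = \begin{pmatrix} \mathbf{T}_x \\ \mathbf{T}_y \\ \mathbf{T}_z \end{pmatrix} = \mathbf{T}_x\mathbf{e}_x + \mathbf{T}_y\mathbf{e}_y + \mathbf{T}_z\mathbf{e}_z. \tag{5.3}$$

We can take the dot product of $\hat{T}$ with n, and call it the vector component 
of $\hat{T}$ associated with the direction $\mathbf{n} = n_x\mathbf{e}_x + n_y\mathbf{e}_y + n_z\mathbf{e}_z$, and represent it by 
$\mathbf{T}_n$: 

$$\mathbf{T}_n = \hat{T}\cdot\mathbf{n} = \begin{pmatrix} \mathbf{T}_x & \mathbf{T}_y & \mathbf{T}_z \end{pmatrix}\begin{pmatrix} n_x \\ n_y \\ n_z \end{pmatrix} = \mathbf{T}_x n_x + \mathbf{T}_y n_y + \mathbf{T}_z n_z. \tag{5.4}$$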


Note the most distinguishing features of a vector and a tensor. The 
scalar components $(V_x, V_y, V_z)$ uniquely determine the vector V. The vector 
components $(\mathbf{T}_x, \mathbf{T}_y, \mathbf{T}_z)$ uniquely determine the tensor $\hat{T}$. 


Scalars, vectors and tensors are all geometrical objects$^{a}$ illustrated in Figs. 
5.1(a)-5.1(c). They can be represented, respectively, as a point (on the real 
line), a measured and directed straight line V, and a triplet of directed straight lines 
$(\mathbf{T}_x, \mathbf{T}_y, \mathbf{T}_z)$. 
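
The analogy between Eqs. (5.2) and (5.4) can be tried out numerically. The following is only a minimal sketch in Python; the numbers chosen for V, n and the three vector components of the tensor are made up for illustration, and the helper names `dot` and `vector_component` are our own.

```python
# Sketch: scalar component of a vector, Eq. (5.2), and vector component
# of a tensor, Eq. (5.4), along a unit direction n.
# All numerical values are illustrative assumptions.

def dot(u, v):
    """Ordinary dot product of two 3-component vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def vector_component(T_x, T_y, T_z, n):
    """T_n = T_x n_x + T_y n_y + T_z n_z, where T_x, T_y, T_z are vectors."""
    return [T_x[k] * n[0] + T_y[k] * n[1] + T_z[k] * n[2] for k in range(3)]

V = [3.0, 4.0, 12.0]
n = [0.0, 0.6, 0.8]          # unit vector: 0.36 + 0.64 = 1

V_n = dot(V, n)              # a single number

T_x = [1.0, 0.0, 2.0]        # vector component associated with X
T_y = [0.0, 5.0, 0.0]        # vector component associated with Y
T_z = [2.0, 0.0, 10.0]       # vector component associated with Z

T_n = vector_component(T_x, T_y, T_z, n)   # a vector, not a number
print(V_n, T_n)
```

The projection of the vector yields one number, whereas the "projection" of the tensor yields three numbers, i.e. a vector.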

We shall now take a more rigorous look at tensors, using a language which may 
be somewhat abstract. 


5.1.2. Linear operator in a vector space 


We shall begin by explaining what we mean by linear operator in a vector 
space. 

By the three-dimensional linear vector space $\mathcal{V}$ we mean the set of all 
vectors A, B, C, . . . we can think of, together with all the vectors we can construct 
by combining them linearly, e.g. $\eta\mathbf{A} + \lambda\mathbf{B}$, where $\eta, \lambda$ are real numbers. 

Let us think of two vectors C and D having Cartesian components $(C_x, C_y, C_z)$ 
and $(D_x, D_y, D_z)$ and related to each other in such a way that the 
values of the former determine the values of the latter. This means that C is 
an independent vector and D is a dependent one. In other words, D is a 
function of C. Let us further assume that D is proportional to C. That is, if 
for example we double C, then D is doubled. These two vectors, however, 
may or may not be in the same direction. In that case, we say that a linear 
operator $\hat{O}$ transforms C into D. We may like to write this transformation 
symbolically as 

$$\hat{O}(\mathbf{C}) = \mathbf{D}. \tag{5.5}$$

The property of linearity means that 

$$\text{if } \hat{O}(\mathbf{C}) = \mathbf{D} \text{ and } \hat{O}(\mathbf{E}) = \mathbf{F}, \text{ then } \hat{O}(a\mathbf{C} + b\mathbf{E}) = a\mathbf{D} + b\mathbf{F}, \tag{5.6}$$

where a, b are two arbitrary scalar constants. 
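
The linearity property (5.6) can be checked numerically once the operator is given a concrete form. The sketch below assumes, ahead of the discussion that follows, that the operator acts through a 3 × 3 array of numbers; the array `O`, the vectors and the helper names are illustrative choices of ours.

```python
# Sketch: numerical check of the linearity property, Eq. (5.6).
# The operator is modelled (an assumption) as a 3x3 array acting on a vector.

def apply(O, C):
    """Return D with D_i = sum_j O[i][j] * C[j]."""
    return [sum(O[i][j] * C[j] for j in range(3)) for i in range(3)]

def combine(a, X, b, Y):
    """Return the linear combination a*X + b*Y of two vectors."""
    return [a * x + b * y for x, y in zip(X, Y)]

O = [[2.0, 1.0, 0.0],
     [1.0, 3.0, 0.0],
     [0.0, 0.0, 4.0]]
C = [1.0, 0.0, 2.0]
E = [0.0, 1.0, -1.0]
a, b = 2.0, -3.0

D = apply(O, C)
F = apply(O, E)

lhs = apply(O, combine(a, C, b, E))   # O(aC + bE)
rhs = combine(a, D, b, F)             # aD + bF
print(lhs, rhs)
```

Note that D is not parallel to C here: the operator changes the direction as well as the length.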

In Fig. 5.2 we have shown two simple examples of how the operation $\hat{O}$ 
can take place. In Fig. 5.2(a), we have shown a particle of constant mass m 
in arbitrary motion along some trajectory $\Gamma$. At some instant of time t it has 
velocity v. Therefore, its momentum at the same instant is p = mv. We can 
therefore think of the operator $\hat{O}$ transforming velocity v into momentum p 
by scaling the length of the former by the factor m without changing its 
direction. 


Fig. 5.2. Two examples of how a linear operator $\hat{O}$ transforms a vector into another vector: (a) $\hat{O}$ 
acting on v yields p; (b, c) $\hat{O}$ acting on $\boldsymbol{\omega}$ yields L. 


In Fig. 5.2(b), we have shown a rigid body rotating about some axis 
pointing in the direction of the unit vector n with angular speed $\omega$, so that 
its angular velocity is $\boldsymbol{\omega} = \omega\mathbf{n}$. Its angular momentum is L, which (in 
general) does not coincide with the direction of $\boldsymbol{\omega}$. In this case, the operator 
$\hat{O}$ transforms the angular velocity $\boldsymbol{\omega}$ into the angular momentum L by changing 
the length as well as the direction. The linear operator $\hat{O}$ in this case is the 
inertia tensor $\hat{I}$, about which we shall give some more insight in Sec. 5.1.5. 

For our immediate purpose, we shall look upon a tensor $\hat{T}$ as a linear 
operator. The linear operation mentioned above suggests that $\hat{T}$ can be 
represented by a matrix, and the “tensor operation” can be represented as a 
matrix multiplication. This will become evident in the next section. 


5.1.3. Tensor as a dyadic 


Two arbitrary vectors A, B can be combined in three types of 
“multiplication operation”, the first two of which the reader is familiar with, 
namely, (1) the dot product A · B, which is a scalar; (2) the cross product A 
× B, which is a vector. Now comes (3) the third type, namely the dyadic 
product AB, which is a simple juxtaposition of the vectors, without any dot 
or cross in between, which we shall call a dyad.$^{b}$ 

We define the dyad AB to be a linear operator which converts any 
vector C to another vector D, and this conversion can be done in either of 
the following two ways: 

$$\text{operating on the right:}\quad \mathbf{AB}\cdot\mathbf{C} \overset{\text{def}}{=} \mathbf{A}(\mathbf{B}\cdot\mathbf{C}) = \eta\mathbf{A}, \quad \text{where } \eta = \mathbf{B}\cdot\mathbf{C} = \text{scalar}; \tag{5.7a}$$

$$\text{operating on the left:}\quad \mathbf{C}\cdot\mathbf{AB} \overset{\text{def}}{=} (\mathbf{C}\cdot\mathbf{A})\mathbf{B} = \lambda\mathbf{B}, \quad \text{where } \lambda = \mathbf{C}\cdot\mathbf{A} = \text{scalar}. \tag{5.7b}$$

The linearity property follows from the operation defined in (5.7). Also 
note that in general, AB ≠ BA. 
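
The two operations in (5.7) are easy to try out. The following is a minimal sketch with made-up vectors; the function names `dyad_dot_right` and `dyad_dot_left` are hypothetical labels of ours for the rules (5.7a) and (5.7b).

```python
# Sketch: a dyad AB acting on a vector C from either side, Eqs. (5.7a)-(5.7b).
# The vectors are illustrative assumptions.

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def scale(s, v):
    return [s * vi for vi in v]

def dyad_dot_right(A, B, C):
    """AB . C = A (B . C): a vector along A."""
    return scale(dot(B, C), A)

def dyad_dot_left(C, A, B):
    """C . AB = (C . A) B: a vector along B."""
    return scale(dot(C, A), B)

A = [1.0, 2.0, 0.0]
B = [0.0, 1.0, 1.0]
C = [3.0, 1.0, 2.0]

right = dyad_dot_right(A, B, C)     # eta * A with eta = B.C = 3
left = dyad_dot_left(C, A, B)       # lambda * B with lambda = C.A = 5

# AB and BA are different operators: compare their action on the same C.
ab_on_c = dyad_dot_right(A, B, C)
ba_on_c = dyad_dot_right(B, A, C)
print(right, left, ab_on_c != ba_on_c)
```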


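We shall write the sum of two dyads AB and EF as AB + EF and define 
it by the distributive property: 

$$\begin{aligned}
(\mathbf{AB} + \mathbf{EF})\cdot\mathbf{C} &\overset{\text{def}}{=} \mathbf{AB}\cdot\mathbf{C} + \mathbf{EF}\cdot\mathbf{C} = \mathbf{A}(\mathbf{B}\cdot\mathbf{C}) + \mathbf{E}(\mathbf{F}\cdot\mathbf{C}), \\
\mathbf{C}\cdot(\mathbf{AB} + \mathbf{EF}) &\overset{\text{def}}{=} \mathbf{C}\cdot\mathbf{AB} + \mathbf{C}\cdot\mathbf{EF} = (\mathbf{C}\cdot\mathbf{A})\mathbf{B} + (\mathbf{C}\cdot\mathbf{E})\mathbf{F}.
\end{aligned} \tag{5.8}$$

It should be a simple exercise to show from Eq. (5.7) that the dyadic 
product is distributive, i.e. if E, F, C are three arbitrary vectors, then 

$$\begin{aligned}
(\mathbf{E} + \mathbf{F})\mathbf{C} &= \mathbf{EC} + \mathbf{FC}, \\
\mathbf{C}(\mathbf{E} + \mathbf{F}) &= \mathbf{CE} + \mathbf{CF}.
\end{aligned} \tag{5.9}$$

As a corollary, 

$$(\mathbf{A} + \mathbf{B})(\mathbf{E} + \mathbf{F}) = \mathbf{AE} + \mathbf{AF} + \mathbf{BE} + \mathbf{BF}. \tag{5.10}$$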


A sum of dyads can be called a dyadic. We shall prefer to use the term 
“dyadic” as a general name for sums of dyads as well as individual dyads. 


We shall frequently use the symbols $\mathbf{e}_x, \mathbf{e}_y, \mathbf{e}_z$ to represent unit vectors in 
the directions of the X-, Y-, Z-axes, for which we had used i, j, k earlier in 
this chapter. As we progress, we shall use another set of symbols $\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3$ 
to mean the same unit vectors. This transition $(\mathbf{i}, \mathbf{j}, \mathbf{k}) \to (\mathbf{e}_x, \mathbf{e}_y, \mathbf{e}_z) \to (\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3)$, 
side by side with $(x, y, z) \to (x_1, x_2, x_3)$, will restore symmetry and 
help us use Einstein’s summation convention (following Eq. (5.16)). 

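The unit vectors $(\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3)$, in both Cartesian and spherical coordinate 
systems, form an orthogonal right-handed triple, and this property is 
expressed as 

$$\begin{aligned}
&\mathbf{e}_1\cdot\mathbf{e}_2 = \mathbf{e}_2\cdot\mathbf{e}_3 = \mathbf{e}_3\cdot\mathbf{e}_1 = 0, \\
&\mathbf{e}_1\cdot\mathbf{e}_1 = \mathbf{e}_2\cdot\mathbf{e}_2 = \mathbf{e}_3\cdot\mathbf{e}_3 = 1, \\
&\mathbf{e}_1\times\mathbf{e}_2 = \mathbf{e}_3;\quad \mathbf{e}_2\times\mathbf{e}_3 = \mathbf{e}_1;\quad \mathbf{e}_3\times\mathbf{e}_1 = \mathbf{e}_2;
\end{aligned} \tag{5.11}$$

or, more compactly as 

$$\begin{aligned}
\mathbf{e}_i\cdot\mathbf{e}_j &= \delta_{ij}, \\
\mathbf{e}_i\times\mathbf{e}_j &= \varepsilon_{ijk}\,\mathbf{e}_k,
\end{aligned} \tag{5.12}$$

where $\delta_{ij}$, called the Kronecker delta, and $\varepsilon_{ijk}$, called the Levi-Civita symbol, are 
defined as 

$$\delta_{ij} = \begin{cases} 1, & \text{if } i = j, \\ 0, & \text{if } i \neq j. \end{cases} \tag{5.13}$$

$$\varepsilon_{ijk} = \begin{cases} +1 & \text{if } ijk = 123,\ 231,\ 312, \\ -1 & \text{if } ijk = 213,\ 321,\ 132, \\ 0 & \text{if } i = j, \text{ or } j = k, \text{ or } k = i. \end{cases} \tag{5.14}$$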
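
The definitions (5.13) and (5.14) translate directly into code. The sketch below is only an illustration: the closed-form expression used for the Levi-Civita symbol is one convenient choice of ours (valid for indices taken from 1, 2, 3), and the cross product is built from the component form of the second line of (5.12).

```python
# Sketch: Kronecker delta, Levi-Civita symbol, and the cross product
# written with the summation convention, cf. Eqs. (5.12)-(5.14).

def delta(i, j):
    """Kronecker delta, Eq. (5.13)."""
    return 1 if i == j else 0

def epsilon(i, j, k):
    """Levi-Civita symbol, Eq. (5.14), for i, j, k in {1, 2, 3}.
    The product (i-j)(j-k)(k-i) equals +2, -2 or 0 for these indices."""
    return (i - j) * (j - k) * (k - i) // 2

def cross(A, B):
    """(A x B)_k = epsilon_ijk A_i B_j, summed over i and j."""
    idx = (1, 2, 3)
    return [sum(epsilon(i, j, k) * A[i - 1] * B[j - 1] for i in idx for j in idx)
            for k in idx]

e = {1: [1, 0, 0], 2: [0, 1, 0], 3: [0, 0, 1]}   # Cartesian base vectors

print(cross(e[1], e[2]), cross(e[2], e[3]), cross(e[3], e[1]))
print(cross([1, 2, 3], [4, 5, 6]))
```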
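Let us now consider the set of nine dyads $\{\mathbf{e}_x\mathbf{e}_x, \mathbf{e}_x\mathbf{e}_y, \mathbf{e}_x\mathbf{e}_z, \ldots, \mathbf{e}_z\mathbf{e}_z\}$. 
Using them we can construct the following dyadic 

$$\begin{aligned}
\hat{T} &= T_{xx}\mathbf{e}_x\mathbf{e}_x + T_{xy}\mathbf{e}_x\mathbf{e}_y + \cdots + T_{zy}\mathbf{e}_z\mathbf{e}_y + T_{zz}\mathbf{e}_z\mathbf{e}_z = \sum_{i=1}^{3}\sum_{j=1}^{3} T_{ij}\,\mathbf{e}_i\mathbf{e}_j \\
&= T_{ij}\,\mathbf{e}_i\mathbf{e}_j,
\end{aligned} \tag{5.15}$$

where the subscripts (1, 2, 3) represent (x, y, z), respectively. That is, 

$$\mathbf{e}_1 \equiv \mathbf{e}_x;\quad \mathbf{e}_2 \equiv \mathbf{e}_y;\quad \mathbf{e}_3 \equiv \mathbf{e}_z;\qquad T_{11} \equiv T_{xx};\quad T_{12} \equiv T_{xy};\quad \ldots;\quad T_{32} \equiv T_{zy};\quad T_{33} \equiv T_{zz}, \tag{5.16}$$

and the nine coefficients $T_{ij}$ are arbitrary real numbers. 

In the second line of Eq. (5.15), we have introduced Einstein’s 
summation convention: sum over a repeated index, without explicitly 
inserting the sum symbol $\Sigma$. The subscript “i” appears twice, implying a 
sum over i. The subscript “j” appears twice, implying one more sum, this 
time over j. 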

The mathematical object $\hat{T}$ appearing in Eq. (5.15) is what we shall call 
a tensor for all purposes in this book. The set of dyads $\{\mathbf{e}_x\mathbf{e}_x, \mathbf{e}_x\mathbf{e}_y, \mathbf{e}_x\mathbf{e}_z, \ldots, \mathbf{e}_z\mathbf{e}_z\}$ 
can be looked upon as a complete set of base dyads forming a basis $\mathcal{B}$ 
in the tensor space $\mathcal{T}$ of $\hat{T}$. This is analogous to the way that the vectors 
$\{\mathbf{e}_x, \mathbf{e}_y, \mathbf{e}_z\}$ form a basis $b$ in the vector space $\mathcal{V}$ of V. Any arbitrary vector V can 
be written as a linear superposition of the base vectors as 

$$\mathbf{V} = V_x\mathbf{e}_x + V_y\mathbf{e}_y + V_z\mathbf{e}_z; \tag{5.17a}$$

$$\text{where}\quad V_x = \mathbf{V}\cdot\mathbf{e}_x,\quad V_y = \mathbf{V}\cdot\mathbf{e}_y,\quad V_z = \mathbf{V}\cdot\mathbf{e}_z, \tag{5.17b}$$

are the Cartesian (scalar) components of V in the basis $b$. In the same way 
any arbitrary tensor $\hat{T}$ can be written as a linear superposition of the base 
dyads, as in Eq. (5.15), where the nine quantities $\{T_{xx}, T_{xy}, \ldots, T_{zy}, T_{zz}\}$ are 
to be interpreted as the Cartesian (scalar) components of $\hat{T}$ with respect to 
the basis $\mathcal{B}$. 

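From the definition of dyad given in (5.7), and the orthogonality of the 
base vectors $\{\mathbf{e}_x, \mathbf{e}_y, \mathbf{e}_z\}$, i.e. 

$$\mathbf{e}_j\cdot\mathbf{e}_k = \delta_{jk}, \qquad j, k = 1, 2, 3 = x, y, z, \tag{5.18}$$

it should be apparent that the base dyads operating on any arbitrary vector 
V will yield the following vectors: 

$$\begin{aligned}
&\mathbf{e}_x\mathbf{e}_x\cdot\mathbf{V} = \mathbf{e}_x V_x;\quad \mathbf{e}_x\mathbf{e}_y\cdot\mathbf{V} = \mathbf{e}_x V_y;\quad \ldots;\quad \mathbf{e}_z\mathbf{e}_z\cdot\mathbf{V} = \mathbf{e}_z V_z; \\
&\mathbf{V}\cdot\mathbf{e}_x\mathbf{e}_x = V_x\mathbf{e}_x;\quad \mathbf{V}\cdot\mathbf{e}_x\mathbf{e}_y = V_x\mathbf{e}_y;\quad \ldots;\quad \mathbf{V}\cdot\mathbf{e}_z\mathbf{e}_z = V_z\mathbf{e}_z.
\end{aligned} \tag{5.19}$$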


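Hence, if $\mathbf{A} = A_x\mathbf{e}_x + A_y\mathbf{e}_y + A_z\mathbf{e}_z$ and $\mathbf{B} = B_x\mathbf{e}_x + B_y\mathbf{e}_y + B_z\mathbf{e}_z$ are two 
arbitrary vectors, then 

$$\mathbf{A}\cdot\hat{T}\cdot\mathbf{B} \overset{\text{def}}{=} \mathbf{A}\cdot(\hat{T}\cdot\mathbf{B}) = (\mathbf{A}\cdot\hat{T})\cdot\mathbf{B} = A_i T_{ij} B_j. \tag{5.20a}$$

$$\text{Special case:}\quad \mathbf{e}_i\cdot\hat{T}\cdot\mathbf{e}_j = T_{ij}. \tag{5.20b}$$

If the nine components $\{T_{ij}\}$ of a tensor $\hat{T}$ are given, the tensor can be 
constructed using Eq. (5.15). Conversely, if a tensor $\hat{T}$ is given in the form 
of a mathematical relation, its nine components $T_{ij}$ can be retrieved by 
means of Eq. (5.20b). 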
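Using the distributive property given in (5.10) it is seen that the dyadic 
product of A and B has the following dyadic representation: 

$$\begin{aligned}
\mathbf{AB} &= A_x B_x\,\mathbf{e}_x\mathbf{e}_x + A_x B_y\,\mathbf{e}_x\mathbf{e}_y + \cdots + A_z B_y\,\mathbf{e}_z\mathbf{e}_y + A_z B_z\,\mathbf{e}_z\mathbf{e}_z \\
&= A_i B_j\,\mathbf{e}_i\mathbf{e}_j.
\end{aligned} \tag{5.21}$$

Hence, if we write 

$$\hat{T} = \mathbf{AB}, \quad \text{then}\quad T_{ij} = A_i B_j. \tag{5.22}$$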


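Using Eq. (5.7a), the operation of the tensor $\hat{T}$ on the vector $\mathbf{C} = C_k\mathbf{e}_k$ 
placed on the right works out as follows: 

$$\begin{aligned}
\hat{T}\cdot\mathbf{C} &= (T_{ij}\,\mathbf{e}_i\mathbf{e}_j)\cdot(C_k\mathbf{e}_k) \\
&= T_{ij}C_k\,\mathbf{e}_i(\mathbf{e}_j\cdot\mathbf{e}_k) \\
&= \mathbf{e}_i(T_{ij}C_j).
\end{aligned} \tag{5.23}$$

We have used the orthogonality relation (5.18) to get to the last line. 
In a similar way, using Eq. (5.7b), the operation of the tensor $\hat{T}$ on the 
vector $\mathbf{C} = C_k\mathbf{e}_k$ placed on the left works out as follows: 

$$\begin{aligned}
\mathbf{C}\cdot\hat{T} &= (C_k\mathbf{e}_k)\cdot(T_{ij}\,\mathbf{e}_i\mathbf{e}_j) \\
&= C_k T_{ij}(\mathbf{e}_k\cdot\mathbf{e}_i)\mathbf{e}_j \\
&= (C_i T_{ij})\mathbf{e}_j.
\end{aligned} \tag{5.24}$$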


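The above two equations suggest that if we write $\mathbf{D} = \hat{T}\cdot\mathbf{C}$ and $\mathbf{F} = \mathbf{C}\cdot\hat{T}$, 
then the Cartesian components $(D_1, D_2, D_3)$ of D and $(F_1, F_2, F_3)$ of F can 
be obtained from matrix multiplications: 

$$\begin{pmatrix} D_1 \\ D_2 \\ D_3 \end{pmatrix} = \begin{pmatrix} T_{11} & T_{12} & T_{13} \\ T_{21} & T_{22} & T_{23} \\ T_{31} & T_{32} & T_{33} \end{pmatrix}\begin{pmatrix} C_1 \\ C_2 \\ C_3 \end{pmatrix}, \tag{5.25a}$$

$$\begin{pmatrix} F_1 & F_2 & F_3 \end{pmatrix} = \begin{pmatrix} C_1 & C_2 & C_3 \end{pmatrix}\begin{pmatrix} T_{11} & T_{12} & T_{13} \\ T_{21} & T_{22} & T_{23} \\ T_{31} & T_{32} & T_{33} \end{pmatrix}. \tag{5.25b}$$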


In the above equations, starting from Eq. (5.7), we have used a dot (·) to 
separate the tensor from the vector on which it is operating. We shall 
frequently refer to a tensor operation as a dot product between the tensor 
and the vector. Equations (5.25a) and (5.25b) show that a dot product 
actually involves a matrix multiplication. A tensor is to be represented as a 
square matrix, and a vector either as a column matrix or a row matrix, 
depending on whether the tensor operation is on the right or on the left.$^{c}$ 

$$\hat{T} = \begin{pmatrix} T_{11} & T_{12} & T_{13} \\ T_{21} & T_{22} & T_{23} \\ T_{31} & T_{32} & T_{33} \end{pmatrix} = [T], \qquad \mathbf{C} = \begin{pmatrix} C_1 \\ C_2 \\ C_3 \end{pmatrix} = \{C\}, \qquad \mathbf{F} = \begin{pmatrix} F_1 & F_2 & F_3 \end{pmatrix} = (F). \tag{5.26}$$

In the above equations, we have adopted the convention of indicating a 
3 × 3 square matrix by [ ], a 3 × 1 column matrix by { }, and a 1 × 3 row 
matrix by ( ). Hence, Eqs. (5.25a) and (5.25b) can be written as 

$$\{D\} = [T]\{C\}, \qquad (F) = (C)[T]. \tag{5.27}$$
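
The two matrix multiplications in (5.27) can be spelled out as follows. This is only a sketch with an arbitrarily chosen, deliberately non-symmetric array `T`, so that the right and left operations give visibly different vectors; the function names are ours.

```python
# Sketch: tensor operation on the right, {D} = [T]{C}, and on the left,
# (F) = (C)[T], cf. Eqs. (5.25) and (5.27). Values are illustrative.

def tensor_dot_vector(T, C):
    """D_i = T_ij C_j (vector treated as a column)."""
    return [sum(T[i][j] * C[j] for j in range(3)) for i in range(3)]

def vector_dot_tensor(C, T):
    """F_j = C_i T_ij (vector treated as a row)."""
    return [sum(C[i] * T[i][j] for i in range(3)) for j in range(3)]

T = [[1, 2, 0],
     [0, 1, 3],
     [4, 0, 1]]
C = [1, 2, 3]

D = tensor_dot_vector(T, C)
F = vector_dot_tensor(C, T)
print(D, F)
```

For a symmetric array the two results would coincide; in general they do not.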


It follows from Eq. (5.21) that the matrix representation of the dyadic 
AB is 

$$\mathbf{AB} = \begin{pmatrix} A_1B_1 & A_1B_2 & A_1B_3 \\ A_2B_1 & A_2B_2 & A_2B_3 \\ A_3B_1 & A_3B_2 & A_3B_3 \end{pmatrix}. \tag{5.28}$$


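We shall define the dot product of two tensors $\hat{S}$ and $\hat{T}$ as the tensor 
$\hat{R} = \hat{S}\cdot\hat{T}$ by its operation on an arbitrary vector C on the right in the 
following way: 

$$(\hat{S}\cdot\hat{T})\cdot\mathbf{C} \overset{\text{def}}{=} \hat{S}\cdot(\hat{T}\cdot\mathbf{C}). \tag{5.29}$$

From this, it follows that the matrix representing $\hat{R}$ is given by the product 
of the matrices representing $\hat{S}$ and $\hat{T}$. That is, 

$$[R] = [S][T], \quad \text{implying:}\quad R_{ij} = S_{ik}T_{kj}. \tag{5.30}$$

It is then obvious that, in general, $\hat{S}\cdot\hat{T} \neq \hat{T}\cdot\hat{S}$. 

Using the matrix representation as given in Eq. (5.30), and the tensor 
operation on the left as found out in (5.24), we can now see how the product 
tensor $\hat{R} = \hat{S}\cdot\hat{T}$ will act on the left. 

$$\begin{aligned}
\mathbf{C}\cdot\hat{R} &= (C_k R_{kj})\mathbf{e}_j = (C_k S_{km} T_{mj})\mathbf{e}_j = (C_k S_{km})(T_{mj}\mathbf{e}_j), \\
\text{or}\quad \mathbf{C}\cdot(\hat{S}\cdot\hat{T}) &= (\mathbf{C}\cdot\hat{S})\cdot\hat{T}.
\end{aligned} \tag{5.31}$$

We can extend the definition of the dot product to any number of 
tensors, by writing the matrix representation of the product tensor as the 
product of the representative matrices of the component tensors. For 
example, 

$$\text{if } \hat{R} = \hat{A}\cdot\hat{B}\cdot\hat{C}, \quad \text{then}\quad [R] = [A][B][C]. \tag{5.32}$$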
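
The definition (5.29), its matrix form (5.30), and the remark about non-commutativity can be verified with small arrays. The sketch below uses two made-up integer matrices `S` and `T`; nothing in it is specific to any physical tensor.

```python
# Sketch: dot product of two tensors as a matrix product, Eqs. (5.29)-(5.30).
# S, T and C are illustrative assumptions.

def matmul(S, T):
    """R_ij = S_ik T_kj."""
    return [[sum(S[i][k] * T[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def tensor_dot_vector(T, C):
    """D_i = T_ij C_j."""
    return [sum(T[i][j] * C[j] for j in range(3)) for i in range(3)]

S = [[0, 1, 0],
     [0, 0, 0],
     [0, 0, 1]]
T = [[0, 0, 0],
     [1, 0, 0],
     [0, 0, 2]]
C = [1, 2, 3]

ST = matmul(S, T)
TS = matmul(T, S)

via_product = tensor_dot_vector(ST, C)                       # (S.T).C
via_steps = tensor_dot_vector(S, tensor_dot_vector(T, C))    # S.(T.C)
print(ST, TS, via_product, via_steps)
```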


At this point we shall add a word of caution. A tensor is not the same as 
a square matrix, just as a vector is not the same as a column matrix or a row 
matrix. The row matrix shown in Eq. (5.26), for example, gives the 
components of the vector F in a given coordinate system XYZ. As the 
coordinates are changed from (x, y, z) to (x′, y′, z′), the components will 
transform from $(F_1, F_2, F_3)$ to $(F'_1, F'_2, F'_3)$. However, the vector F itself is a 
“geometrical object” (a straight line of measured length pointing in an 
assigned direction) which remains invariant under all coordinate 
transformations. In the same way, the tensor $\hat{T}$ is a geometrical object, 
which remains invariant under all coordinate transformations, even though 
its components will change from the square matrix $[T_{ij}]$ to another square 
matrix $[T'_{ij}]$ under the same coordinate transformation. 

Yes, the components of tensors will in general transform. The exception is 
the identity tensor (and its scalar multiples), which we shall introduce in the next 
section. Its components will remain the same as in (5.34) under any 
change from one orthonormal set of base vectors to another. 


5.1.4. Identity tensor, completeness relation, components of a 
tensor in the spherical coordinate system 


In matrix multiplication one needs the identity matrix [1], which, in the 
present context, is the matrix representation of the identity tensor, also 
known by the alternative name idemfactor. It will be recognized by the 
symbol $\hat{1}$. Its sole property is that when it operates on any vector V, either 
on the right or on the left, it gives back the same vector: 

$$\hat{1}\cdot\mathbf{V} = \mathbf{V}, \qquad \mathbf{V}\cdot\hat{1} = \mathbf{V}. \tag{5.33}$$

Such a tensor must have the unit matrix for its matrix representation. The dyadic 
representation (shown below) follows from the above property and the 
orthogonality relation (5.18). 

$$[1] = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \tag{5.34a}$$

$$\hat{1} = \mathbf{e}_x\mathbf{e}_x + \mathbf{e}_y\mathbf{e}_y + \mathbf{e}_z\mathbf{e}_z = \mathbf{e}_i\mathbf{e}_i. \tag{5.34b}$$

Equation (5.34a) gives the matrix representation, and Eq. (5.34b) the 
dyadic representation. 


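It will be advantageous to write the tensor $\hat{T}$ in a curvilinear coordinate 
system, in particular, the spherical coordinate system. For this purpose, we shall 
write down the transformation equations for the coordinates and the base 
vectors: 

$$\begin{aligned}
x &= r\sin\theta\cos\phi, & [0 \le r < \infty], \\
y &= r\sin\theta\sin\phi, & [0 \le \theta \le \pi], \\
z &= r\cos\theta, & [0 \le \phi < 2\pi].
\end{aligned} \tag{5.35}$$

In the above equations, we have indicated the “ranges” of the three 
coordinates within the [ ] brackets. 

$$\begin{aligned}
\mathbf{e}_r &= \sin\theta(\cos\phi\,\mathbf{e}_x + \sin\phi\,\mathbf{e}_y) + \cos\theta\,\mathbf{e}_z, \\
\mathbf{e}_\theta &= \cos\theta(\cos\phi\,\mathbf{e}_x + \sin\phi\,\mathbf{e}_y) - \sin\theta\,\mathbf{e}_z, \\
\mathbf{e}_\phi &= -\sin\phi\,\mathbf{e}_x + \cos\phi\,\mathbf{e}_y.
\end{aligned} \tag{5.36}$$

Using these equations (and remembering that $\mathbf{e}_r\mathbf{e}_\theta \neq \mathbf{e}_\theta\mathbf{e}_r$, for example), 
it should be a simple exercise to show that 

$$\mathbf{e}_r\mathbf{e}_r + \mathbf{e}_\theta\mathbf{e}_\theta + \mathbf{e}_\phi\mathbf{e}_\phi = \mathbf{e}_x\mathbf{e}_x + \mathbf{e}_y\mathbf{e}_y + \mathbf{e}_z\mathbf{e}_z = \hat{1}. \tag{5.37}$$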
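
The relation (5.37) can be checked numerically at any chosen pair of angles before attempting the algebra. The sketch below builds the three spherical base vectors from Eq. (5.36), forms the three dyads as 3 × 3 arrays in the manner of Eq. (5.28), and adds them; the angles 0.7 and 2.1 radians are arbitrary choices of ours.

```python
# Sketch: numerical check of the completeness relation, Eq. (5.37).
import math

def spherical_basis(theta, phi):
    """Return (e_r, e_theta, e_phi) as Cartesian triples, following Eq. (5.36)."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    e_r = [st * cp, st * sp, ct]
    e_theta = [ct * cp, ct * sp, -st]
    e_phi = [-sp, cp, 0.0]
    return e_r, e_theta, e_phi

def dyad(a, b):
    """Matrix of the dyad ab: (ab)_ij = a_i b_j."""
    return [[ai * bj for bj in b] for ai in a]

def completeness_sum(basis):
    """Sum of the dyads ee over all vectors e of the basis."""
    total = [[0.0] * 3 for _ in range(3)]
    for e in basis:
        d = dyad(e, e)
        for i in range(3):
            for j in range(3):
                total[i][j] += d[i][j]
    return total

total = completeness_sum(spherical_basis(0.7, 2.1))
for row in total:
    print([round(x, 12) for x in row])
```

Up to rounding errors, the printed array is the unit matrix of Eq. (5.34a).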


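If we have three unit vectors {a, b, c} which are mutually orthogonal at 
every point in space and such that 

$$\mathbf{aa} + \mathbf{bb} + \mathbf{cc} = \hat{1}, \tag{5.38}$$

then we say that these three vectors form a complete orthogonal set, and 
hence a basis, so that any arbitrary vector V can be represented as a linear 
superposition of these three vectors.$^{d}$ This should be clear from the 
following: 

$$\begin{aligned}
&\mathbf{V} = \mathbf{V}\cdot\hat{1} = \mathbf{V}\cdot(\mathbf{aa} + \mathbf{bb} + \mathbf{cc}) = V_a\mathbf{a} + V_b\mathbf{b} + V_c\mathbf{c}, \\
&\text{where}\quad V_a = \mathbf{V}\cdot\mathbf{a},\quad V_b = \mathbf{V}\cdot\mathbf{b},\quad V_c = \mathbf{V}\cdot\mathbf{c},
\end{aligned} \tag{5.39}$$

are the components of V in the directions of {a, b, c}, respectively. Using 
the completeness property, it can be advantageous to write a tensor in the 
following style. 

$$\begin{aligned}
&\hat{T} = \hat{1}\cdot\hat{T}\cdot\hat{1} = (\mathbf{aa} + \mathbf{bb} + \mathbf{cc})\cdot\hat{T}\cdot(\mathbf{aa} + \mathbf{bb} + \mathbf{cc}) \\
&\quad = T_{aa}\mathbf{aa} + T_{ab}\mathbf{ab} + T_{ac}\mathbf{ac} + \cdots + T_{cb}\mathbf{cb} + T_{cc}\mathbf{cc}, \quad \text{where} \\
&T_{aa} = \mathbf{a}\cdot\hat{T}\cdot\mathbf{a},\quad T_{ab} = \mathbf{a}\cdot\hat{T}\cdot\mathbf{b},\ \ldots,\quad T_{cb} = \mathbf{c}\cdot\hat{T}\cdot\mathbf{b},\quad T_{cc} = \mathbf{c}\cdot\hat{T}\cdot\mathbf{c}
\end{aligned} \tag{5.40}$$

are the components of $\hat{T}$ with respect to the basis {a, b, c}. 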
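We shall illustrate the operation shown in Eq. (5.40) by writing the 
tensor $\hat{T}$ in Cartesian and spherical coordinate systems: 

$$\begin{aligned}
\hat{T} &= (\mathbf{e}_x\mathbf{e}_x + \mathbf{e}_y\mathbf{e}_y + \mathbf{e}_z\mathbf{e}_z)\cdot\hat{T}\cdot(\mathbf{e}_x\mathbf{e}_x + \mathbf{e}_y\mathbf{e}_y + \mathbf{e}_z\mathbf{e}_z) \\
&= T_{xx}\mathbf{e}_x\mathbf{e}_x + T_{xy}\mathbf{e}_x\mathbf{e}_y + T_{xz}\mathbf{e}_x\mathbf{e}_z + \cdots + T_{zy}\mathbf{e}_z\mathbf{e}_y + T_{zz}\mathbf{e}_z\mathbf{e}_z,
\end{aligned} \tag{5.41a}$$

$$T_{xx} = \mathbf{e}_x\cdot\hat{T}\cdot\mathbf{e}_x,\quad T_{xy} = \mathbf{e}_x\cdot\hat{T}\cdot\mathbf{e}_y,\ \ldots,\quad T_{zy} = \mathbf{e}_z\cdot\hat{T}\cdot\mathbf{e}_y, \tag{5.41b}$$

$$T_{zz} = \mathbf{e}_z\cdot\hat{T}\cdot\mathbf{e}_z. \tag{5.41c}$$

$$\begin{aligned}
\hat{T} &= (\mathbf{e}_r\mathbf{e}_r + \mathbf{e}_\theta\mathbf{e}_\theta + \mathbf{e}_\phi\mathbf{e}_\phi)\cdot\hat{T}\cdot(\mathbf{e}_r\mathbf{e}_r + \mathbf{e}_\theta\mathbf{e}_\theta + \mathbf{e}_\phi\mathbf{e}_\phi) \\
&= T_{rr}\mathbf{e}_r\mathbf{e}_r + T_{r\theta}\mathbf{e}_r\mathbf{e}_\theta + T_{r\phi}\mathbf{e}_r\mathbf{e}_\phi + \cdots + T_{\phi\theta}\mathbf{e}_\phi\mathbf{e}_\theta + T_{\phi\phi}\mathbf{e}_\phi\mathbf{e}_\phi,
\end{aligned} \tag{5.41d}$$

$$T_{rr} = \mathbf{e}_r\cdot\hat{T}\cdot\mathbf{e}_r,\quad T_{r\theta} = \mathbf{e}_r\cdot\hat{T}\cdot\mathbf{e}_\theta,\ \ldots,\quad T_{\phi\theta} = \mathbf{e}_\phi\cdot\hat{T}\cdot\mathbf{e}_\theta, \tag{5.41e}$$

$$T_{\phi\phi} = \mathbf{e}_\phi\cdot\hat{T}\cdot\mathbf{e}_\phi. \tag{5.41f}$$

Equations (5.41a)-(5.41c) represent the tensor $\hat{T}$ in a Cartesian 
coordinate system, and Eqs. (5.41d)-(5.41f) in a spherical coordinate 
system. 

We can then write the components of $\hat{T}$ in the following matrix forms: 

$$[T]^{(\text{Cart})} = \begin{pmatrix} T_{xx} & T_{xy} & T_{xz} \\ T_{yx} & T_{yy} & T_{yz} \\ T_{zx} & T_{zy} & T_{zz} \end{pmatrix}, \qquad [T]^{(\text{sphr})} = \begin{pmatrix} T_{rr} & T_{r\theta} & T_{r\phi} \\ T_{\theta r} & T_{\theta\theta} & T_{\theta\phi} \\ T_{\phi r} & T_{\phi\theta} & T_{\phi\phi} \end{pmatrix}. \tag{5.42}$$

The first matrix gives the Cartesian components, and the second one the 
spherical components. 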

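Using the transformation of the base vectors (5.36), and the 
completeness relations (5.37), one can transform the Cartesian components 
to spherical components, for both vectors and tensors, as we shall show. For 
this purpose, we shall temporarily denote the spherical base vectors with a 
prime, i.e. $\{\mathbf{e}'_i;\ i = r, \theta, \phi\}$, and make a table of transformation 
coefficients $\{c_{ij}\}$: 

$$\begin{aligned}
&[c] = \begin{pmatrix} \sin\theta\cos\phi & \sin\theta\sin\phi & \cos\theta \\ \cos\theta\cos\phi & \cos\theta\sin\phi & -\sin\theta \\ -\sin\phi & \cos\phi & 0 \end{pmatrix}, \\
&\text{where}\quad c_{ij} = \mathbf{e}'_i\cdot\mathbf{e}_j;\quad i = r, \theta, \phi;\quad j = x, y, z.
\end{aligned} \tag{5.43}$$

Now, let V be a vector and $\hat{T}$ be a tensor with Cartesian components 
$[\{V_i\}; \{T_{ij}\};\ i, j = x, y, z]$, respectively. Then the spherical components of the 
same vector and tensor, namely, $[\{V'_i\}; \{T'_{ij}\};\ i, j = r, \theta, \phi]$, will be obtained in the 
following ways$^{e}$: 

$$V'_i = \mathbf{V}\cdot\mathbf{e}'_i = \mathbf{V}\cdot\mathbf{e}_k\mathbf{e}_k\cdot\mathbf{e}'_i = c_{ik}V_k, \tag{5.44a}$$

$$T'_{ij} = \mathbf{e}'_i\cdot\hat{T}\cdot\mathbf{e}'_j = \mathbf{e}'_i\cdot\mathbf{e}_k\mathbf{e}_k\cdot\hat{T}\cdot\mathbf{e}_l\mathbf{e}_l\cdot\mathbf{e}'_j = c_{ik}c_{jl}T_{kl}. \tag{5.44b}$$

Note that we have used the summation convention: sum over k in (5.44a), 
sum over k, l in (5.44b). 

We shall illustrate the transformation formulas (5.44) with two 
examples, i.e., $V_r = V'_1$ and $T_{r\theta} = T'_{12}$: 

$$\begin{aligned}
V_r &= \sin\theta\cos\phi\,V_x + \sin\theta\sin\phi\,V_y + \cos\theta\,V_z, \\
T_{r\theta} &= \sin\theta\cos\phi\,(\cos\theta\cos\phi\,T_{xx} + \cos\theta\sin\phi\,T_{xy} - \sin\theta\,T_{xz}) \\
&\quad + \sin\theta\sin\phi\,(\cos\theta\cos\phi\,T_{yx} + \cos\theta\sin\phi\,T_{yy} - \sin\theta\,T_{yz}) \\
&\quad + \cos\theta\,(\cos\theta\cos\phi\,T_{zx} + \cos\theta\sin\phi\,T_{zy} - \sin\theta\,T_{zz}).
\end{aligned} \tag{5.45}$$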
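
The transformation formulas (5.44) lend themselves to a direct numerical trial. The sketch below fills in the table (5.43) for a pair of arbitrarily chosen angles and transforms a made-up vector and a made-up tensor; the function names and all numbers are assumptions for illustration only.

```python
# Sketch: Cartesian -> spherical components using Eqs. (5.43)-(5.44).
import math

def coefficients(theta, phi):
    """Table c_ij = e'_i . e_j of Eq. (5.43); rows r, theta, phi; columns x, y, z."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    return [[st * cp, st * sp, ct],
            [ct * cp, ct * sp, -st],
            [-sp, cp, 0.0]]

def transform_vector(c, V):
    """V'_i = c_ik V_k, Eq. (5.44a)."""
    return [sum(c[i][k] * V[k] for k in range(3)) for i in range(3)]

def transform_tensor(c, T):
    """T'_ij = c_ik c_jl T_kl, Eq. (5.44b)."""
    return [[sum(c[i][k] * c[j][l] * T[k][l] for k in range(3) for l in range(3))
             for j in range(3)] for i in range(3)]

theta, phi = 0.7, 2.1                      # arbitrary angles (radians)
c = coefficients(theta, phi)

V = [1.0, 2.0, 3.0]                        # Cartesian components (made up)
T = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0],
     [7.0, 8.0, 9.0]]

V_sph = transform_vector(c, V)             # (V_r, V_theta, V_phi)
T_sph = transform_tensor(c, T)             # T_sph[0][1] is T_r_theta

trace_cart = sum(T[i][i] for i in range(3))
trace_sph = sum(T_sph[i][i] for i in range(3))
print(V_sph, trace_cart, trace_sph)
```

Two quantities should come out the same in both sets of components: the squared length of V and the sum of the diagonal components of the tensor. This is a convenient check on any hand calculation based on (5.45).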


5.1.5. Example: Inertia tensor 


We shall illustrate the tensor concept by showing two important examples, 
namely (1) the inertia tensor and (2) the stress tensor. We shall take up a 
short discussion of the first example in this section, leaving the second 
example, which needs a more detailed coverage, to the next section. 

In Sec. 5.1.2 we talked about the tensor operation converting the 
angular velocity $\boldsymbol{\omega}$ into the angular momentum L. The corresponding operator 
is the inertia tensor $\hat{I}$ of the rigid body. Its dot product with the angular 
velocity $\boldsymbol{\omega}$ gives the angular momentum L of the rigid body. That is, 

$$\mathbf{L} = \hat{I}\cdot\boldsymbol{\omega}. \tag{5.46}$$


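We shall find an expression for the vector angular momentum L of a 
rigid body which is rotating about a point O (which can be a moving point, 
e.g. the CM) with angular velocity $\boldsymbol{\omega} = \omega\mathbf{n}$ about the axis pointing in the 
direction of the unit vector n. Let j be one of the constituent particles, 
having mass $m_j$, and located at the radius vector $\mathbf{r}_j$ with respect to O, as 
shown in Fig. 5.2(c). The velocity of this particle is $\mathbf{v}_j = \boldsymbol{\omega}\times\mathbf{r}_j$. Therefore, this 
particle has an angular momentum with respect to the point O, equal to 

$$\boldsymbol{\ell}_j = \mathbf{r}_j\times\mathbf{p}_j = \mathbf{r}_j\times m_j\mathbf{v}_j = m_j\,\mathbf{r}_j\times(\boldsymbol{\omega}\times\mathbf{r}_j) = m_j\left[r_j^2\,\boldsymbol{\omega} - (\mathbf{r}_j\cdot\boldsymbol{\omega})\mathbf{r}_j\right]. \tag{5.47}$$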


Assuming that the rigid body is made of N particles (which is a very large 
number), we add the angular momentum of each particle to obtain the 
angular momentum of the rigid body about the point O, given as 

$$\mathbf{L}_O = \sum_{j=1}^{N} m_j\left[r_j^2\,\boldsymbol{\omega} - (\mathbf{r}_j\cdot\boldsymbol{\omega})\mathbf{r}_j\right]. \tag{5.48}$$

We can write the quantity within square brackets as 

$$\left[r_j^2\,\boldsymbol{\omega} - (\mathbf{r}_j\cdot\boldsymbol{\omega})\mathbf{r}_j\right] = \left[r_j^2\,\hat{1} - \mathbf{r}_j\mathbf{r}_j\right]\cdot\boldsymbol{\omega}, \tag{5.49}$$

and construct the inertia tensor as the dyadic (sum of infinitely small dyads) 

$$\hat{I} = \sum_{j=1}^{N} m_j\left[r_j^2\,\hat{1} - \mathbf{r}_j\mathbf{r}_j\right]. \tag{5.50}$$

Then, we get the angular momentum as the dot product 

$$\mathbf{L}_O = \hat{I}\cdot\boldsymbol{\omega}. \tag{5.51}$$

We have thus derived Eq. (5.46), and along with it have found an 
expression for the inertia tensor in Eq. (5.50). Note that the expression 
within the square brackets is the difference of two dyadics, namely, the 
identity dyadic $\hat{1}$ multiplied by the scalar $r_j^2$, and the dyadic product of $\mathbf{r}_j$ 
with itself. 

For further clarification we shall write down the components of the 
tensor. Assuming that the rigid body has uniform mass density $\rho$ distributed 
over its volume V, the sum in Eq. (5.50) becomes the integral: 

$$\hat{I} = \rho\iiint_V \left[r^2\,\hat{1} - \mathbf{r}\mathbf{r}\right] d^3r. \tag{5.52}$$

Some of its components are 

$$\begin{aligned}
I_{xx} &= \rho\iiint_V \left[r^2 - x^2\right] d^3r = \rho\iiint_V (y^2 + z^2)\, d^3r; \\
I_{xy} &= -\rho\iiint_V (xy)\, d^3r; \quad \text{etc.}
\end{aligned} \tag{5.53}$$

It is now seen that the inertia tensor is a symmetric tensor, i.e. 

$$I_{xy} = I_{yx};\qquad I_{yz} = I_{zy};\qquad I_{zx} = I_{xz}. \tag{5.54}$$

This symmetry property is preserved under all coordinate transformations. 
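
For a handful of point masses the sum (5.50) can be evaluated directly, which makes the statement "L is in general not parallel to the angular velocity" concrete. The sketch below is only an illustration: the two masses, their positions and the angular velocity are made-up values, and the function names are ours. The result of the tensor operation (5.51) is compared with the particle-by-particle sum (5.48).

```python
# Sketch: inertia tensor of a few point masses, Eq. (5.50), and L = I . omega.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def inertia_tensor(masses, positions):
    """I_ik = sum_j m_j (r_j^2 delta_ik - x_i x_k) over all particles j."""
    I = [[0.0] * 3 for _ in range(3)]
    for m, r in zip(masses, positions):
        r2 = dot(r, r)
        for i in range(3):
            for k in range(3):
                I[i][k] += m * (r2 * (1.0 if i == k else 0.0) - r[i] * r[k])
    return I

def angular_momentum_direct(masses, positions, omega):
    """Sum of m_j r_j x (omega x r_j), cf. Eqs. (5.47)-(5.48)."""
    L = [0.0, 0.0, 0.0]
    for m, r in zip(masses, positions):
        l_j = cross(r, cross(omega, r))
        L = [L[i] + m * l_j[i] for i in range(3)]
    return L

masses = [1.0, 2.0]
positions = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
omega = [0.0, 0.0, 1.0]                  # rotation about the Z-axis

I = inertia_tensor(masses, positions)
L_tensor = [sum(I[i][k] * omega[k] for k in range(3)) for i in range(3)]
L_direct = angular_momentum_direct(masses, positions, omega)
print(I, L_tensor, L_direct)
```

In this example the angular velocity points along Z, while L has a non-zero Y-component, so the two vectors are not parallel. The array `I` is also seen to be symmetric, as stated in (5.54).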


5.2. Stress in a Medium 


5.2.1. Stress vector 


By (mechanical) stress we mean internal forces (in the form of 
intermolecular forces) called into play when bulk matter, whether solid, 
liquid or gas, is subjected to external forces. These internal forces 
exist throughout the bulk matter and their mathematical expression is given by 
a stress tensor field $\hat{\mathcal{T}}(x, y, z)$. 

For simplicity we shall consider a solid block in Fig. 5.3(a). It has been 
cut into two parts, the upper block $u$ and the lower block $\ell$, by an imaginary 
plane $\Sigma$, leaving a trace $\Gamma$ of its boundary. This plane is identified by the 
unit normal vector n pointing from the lower block to the upper block. 


Fig. 5.3. Explaining the stress tensor. 


In Fig. 5.3(b), we have shown the lower block $\ell$ with the plane of 
separation $\Sigma$ exposed. Let us consider a small area da at the point P(x, y, z) 
inside the solid, but lying on this plane. Then the stress vector $\boldsymbol{\mathcal{T}}^{(n)}(x, y, z)$ is 
defined to be the force per unit area at P(x, y, z), exerted by the atoms of the 
upper block $u$ on the atoms of the lower block $\ell$ across the plane $\Sigma$. The 
infinitesimal force acting on the area da is then 

$$d\mathbf{F}^{(n)} = \boldsymbol{\mathcal{T}}^{(n)}(x, y, z)\, da. \tag{5.55}$$

Note that in general the direction of the stress vector $\boldsymbol{\mathcal{T}}^{(n)}(x, y, z)$ is 
different from the direction of the normal n. If, however, $\boldsymbol{\mathcal{T}}^{(n)}(x, y, z) \parallel \mathbf{n}$ (i.e. 
perpendicular to the plane), the stress (vector) is called normal stress. If 
$\boldsymbol{\mathcal{T}}^{(n)}(x, y, z) \perp \mathbf{n}$ (i.e. parallel to the plane), it is called shear stress. 


5.2.2. Stress tensor 


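Let us go back to the matrix representation of $\hat{T}$ given in Eq. (5.42), and be 
specific that we are considering only the Cartesian components. Take the 
dot product of $\hat{T}$ with $\mathbf{e}_x$, using the dyadic form (5.15), and call it the vector 
component of $\hat{T}$ associated with the X-direction, and write it as $\mathbf{T}^{(x)}$: 

$$\mathbf{T}^{(x)} \equiv \hat{T}\cdot\mathbf{e}_x = T_{ij}\,\mathbf{e}_i\mathbf{e}_j\cdot\mathbf{e}_x = T_{ij}\,\delta_{jx}\,\mathbf{e}_i = T_{ix}\mathbf{e}_i = T_{xx}\mathbf{e}_x + T_{yx}\mathbf{e}_y + T_{zx}\mathbf{e}_z. \tag{5.56}$$

Comparing with (5.42) we notice that the columns 1, 2, 3 of the 
Cartesian matrix [T] represent three vectors $\mathbf{T}^{(x)}, \mathbf{T}^{(y)}, \mathbf{T}^{(z)}$, respectively, each 
as a column matrix. 

$$[T] = \begin{pmatrix} T_{xx} & T_{xy} & T_{xz} \\ T_{yx} & T_{yy} & T_{yz} \\ T_{zx} & T_{zy} & T_{zz} \end{pmatrix} = \begin{pmatrix} \mathbf{T}^{(x)} & \mathbf{T}^{(y)} & \mathbf{T}^{(z)} \end{pmatrix}, \tag{5.57}$$

$$\mathbf{T}^{(x)} = \begin{pmatrix} T_{xx} \\ T_{yx} \\ T_{zx} \end{pmatrix};\qquad \mathbf{T}^{(y)} = \begin{pmatrix} T_{xy} \\ T_{yy} \\ T_{zy} \end{pmatrix};\qquad \mathbf{T}^{(z)} = \begin{pmatrix} T_{xz} \\ T_{yz} \\ T_{zz} \end{pmatrix}. \tag{5.58}$$

Now let $\mathbf{n} = n_x\mathbf{e}_x + n_y\mathbf{e}_y + n_z\mathbf{e}_z$ be a unit vector, representing some 
direction in space. Let us construct the dot product of $\hat{T}$ with n: 

$$\mathbf{T}^{(n)} \overset{\text{def}}{=} \hat{T}\cdot\mathbf{n}. \tag{5.59}$$

We shall call $\mathbf{T}^{(n)}$ the vector component of the tensor $\hat{T}$ associated with the 
direction n. 

The above dot product operation can be represented as the matrix 
multiplication: 

$$\mathbf{T}^{(n)} = \begin{pmatrix} T^{(n)}_x \\ T^{(n)}_y \\ T^{(n)}_z \end{pmatrix} = \begin{pmatrix} T_{xx} & T_{xy} & T_{xz} \\ T_{yx} & T_{yy} & T_{yz} \\ T_{zx} & T_{zy} & T_{zz} \end{pmatrix}\begin{pmatrix} n_x \\ n_y \\ n_z \end{pmatrix}, \tag{5.60}$$

or, more compactly as 

$$\mathbf{T}^{(n)} = \begin{pmatrix} \mathbf{T}^{(x)} & \mathbf{T}^{(y)} & \mathbf{T}^{(z)} \end{pmatrix}\begin{pmatrix} n_x \\ n_y \\ n_z \end{pmatrix} = \mathbf{T}^{(x)}n_x + \mathbf{T}^{(y)}n_y + \mathbf{T}^{(z)}n_z. \tag{5.61}$$

Let us specialize $\hat{T}$ to the stress tensor $\hat{\mathcal{T}}$. 

$$[\mathcal{T}] = \begin{pmatrix} \boldsymbol{\mathcal{T}}^{(x)} & \boldsymbol{\mathcal{T}}^{(y)} & \boldsymbol{\mathcal{T}}^{(z)} \end{pmatrix} = \begin{pmatrix} \mathcal{T}_{xx} & \mathcal{T}_{xy} & \mathcal{T}_{xz} \\ \mathcal{T}_{yx} & \mathcal{T}_{yy} & \mathcal{T}_{yz} \\ \mathcal{T}_{zx} & \mathcal{T}_{zy} & \mathcal{T}_{zz} \end{pmatrix}. \tag{5.62}$$

Note from the above equation that in $\mathcal{T}_{ij}$ the second index j is the 
“surface index” (indicating the direction of the surface on which stands the 
stress vector $\boldsymbol{\mathcal{T}}^{(j)}$) and the first index i the “component index” (indicating the x, 
y, z components of $\boldsymbol{\mathcal{T}}^{(j)}$). 

Now rewrite Eq. (5.61) as 

$$\boldsymbol{\mathcal{T}}^{(n)} = \boldsymbol{\mathcal{T}}^{(x)}n_x + \boldsymbol{\mathcal{T}}^{(y)}n_y + \boldsymbol{\mathcal{T}}^{(z)}n_z. \tag{5.63}$$

The above equality involving the stress components can be proved using 
Newton’s second law of motion applied to a fluid in motion or a solid under 
deformation. In other words, the stress vectors on three perpendicular 
surfaces determine the stress on any other surface pointing in any arbitrary 
direction. Therefore, stress $\hat{\mathcal{T}}$ is a tensor as per the qualification written in 
Sec. 5.1.1. 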
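
Equation (5.63) says that once the nine stress components are known, the stress vector on any plane follows by a matrix multiplication. The sketch below assumes a made-up symmetric stress matrix (in arbitrary units) and a plane whose normal bisects the X- and Y-axes. It also splits the stress vector into its normal and shear parts, a decomposition that is standard practice but is our own addition to the text.

```python
# Sketch: stress vector on a plane with unit normal n, Eq. (5.63), and its
# normal and shear parts. The stress matrix is an illustrative assumption.
import math

def stress_vector(stress, n):
    """T^(n)_i = T_ij n_j."""
    return [sum(stress[i][j] * n[j] for j in range(3)) for i in range(3)]

def normal_and_shear(stress, n):
    """Return the stress vector, its component along n, and the shear remainder."""
    t = stress_vector(stress, n)
    sigma_n = sum(t[i] * n[i] for i in range(3))
    shear = [t[i] - sigma_n * n[i] for i in range(3)]
    return t, sigma_n, shear

stress = [[10.0, 2.0, 0.0],
          [2.0, 5.0, 0.0],
          [0.0, 0.0, 3.0]]
s = 1.0 / math.sqrt(2.0)
n = [s, s, 0.0]

t, sigma_n, shear = normal_and_shear(stress, n)
shear_magnitude = math.sqrt(sum(c * c for c in shear))

# Reversing the normal reverses the stress vector (linearity of the tensor).
t_opposite = stress_vector(stress, [-c for c in n])
print(t, sigma_n, shear_magnitude, t_opposite)
```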

In Fig. 5.3(c) we have shown the upper part of the solid of Fig. 5.3(a), 
and the same area da as in Fig. 5.3(b), but now on the upper block $u$. The 
normal vector now is −n, and the stress vector is 

$$\boldsymbol{\mathcal{T}}^{(-n)}(x, y, z) = \hat{\mathcal{T}}(x, y, z)\cdot(-\mathbf{n}) = -\boldsymbol{\mathcal{T}}^{(n)}(x, y, z), \tag{5.64}$$

so that the force exerted by the atoms of the lower block $\ell$ on the atoms of 
the upper block $u$ across the same area da is $d\mathbf{F}^{(-n)} = -\boldsymbol{\mathcal{T}}^{(n)}\,da = -d\mathbf{F}^{(n)}$, which 
is in conformity with Newton’s third law of motion. 

In obtaining the last equality in Eq. (5.64) we have used the linearity 
property of the tensor as stipulated in (5.6). In this case $\hat{\mathcal{T}}\cdot(a\mathbf{n}) = a\,\hat{\mathcal{T}}\cdot\mathbf{n}$, 
where a = −1. 

Like the inertia tensor, the stress tensor is a symmetric tensor, i.e. 

$$\mathcal{T}_{xy} = \mathcal{T}_{yx};\qquad \mathcal{T}_{yz} = \mathcal{T}_{zy};\qquad \mathcal{T}_{zx} = \mathcal{T}_{xz}, \tag{5.65}$$

which can be proved using the equation of motion of the angular 
momentum. 


5.2.3. Diagonalization of a symmetric tensor 


A symmetric tensor can always be diagonalized. By this we mean the 
following. Let $\hat{T}$ be a symmetric tensor. Let its components with respect to 
some axes (XYZ) be $\{T_{ij}\}$, with $T_{ij} = T_{ji}$. By a suitable rotation of the axes 
(XYZ), one can arrive at another set of axes $(X_0Y_0Z_0)$ such that $T_{ij} = 0$ if $i \neq j$. 

We can express this formally in the form of the following equation: 

$$[T] = \begin{pmatrix} T_{xx} & T_{xy} & T_{xz} \\ T_{yx} & T_{yy} & T_{yz} \\ T_{zx} & T_{zy} & T_{zz} \end{pmatrix}_{XYZ} \longrightarrow\quad [T] = \begin{pmatrix} T_1 & 0 & 0 \\ 0 & T_2 & 0 \\ 0 & 0 & T_3 \end{pmatrix}_{X_0Y_0Z_0}. \tag{5.66}$$

The axes $(X_0Y_0Z_0)$ are called the principal axes, and the diagonal 
components $(T_1, T_2, T_3)$ are called the principal moments of inertia in the 
case of the inertia tensor, and the principal stresses in the case of the stress 
tensor.$^{f}$ 
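
A simple case shows the rotation at work. In the sketch below the Z-axis is assumed to be a principal axis already, so that a single rotation about Z suffices. The rotation angle is obtained from the condition that the new off-diagonal component vanishes, which for the upper 2 × 2 block with entries $T_{xx}$, $T_{xy}$, $T_{yy}$ gives $\tan 2\alpha = 2T_{xy}/(T_{xx} - T_{yy})$. The matrix and the function names are illustrative choices of ours; a general symmetric tensor would require a full eigenvalue calculation, which is not attempted here.

```python
# Sketch: diagonalizing a symmetric tensor by rotating the axes about Z.
# Assumes T_xz = T_yz = 0, so Z is already a principal axis.
import math

def z_rotation(alpha):
    """Rows are the new base vectors expressed in the old basis."""
    ca, sa = math.cos(alpha), math.sin(alpha)
    return [[ca, sa, 0.0],
            [-sa, ca, 0.0],
            [0.0, 0.0, 1.0]]

def rotate_tensor(c, T):
    """T'_ij = c_ik c_jl T_kl."""
    return [[sum(c[i][k] * c[j][l] * T[k][l] for k in range(3) for l in range(3))
             for j in range(3)] for i in range(3)]

def principal_angle(T):
    """Angle alpha with tan(2 alpha) = 2 T_xy / (T_xx - T_yy)."""
    return 0.5 * math.atan2(2.0 * T[0][1], T[0][0] - T[1][1])

T = [[2.0, 1.0, 0.0],
     [1.0, 2.0, 0.0],
     [0.0, 0.0, 3.0]]

alpha = principal_angle(T)                  # 45 degrees for this matrix
T_principal = rotate_tensor(z_rotation(alpha), T)
for row in T_principal:
    print([round(x, 12) for x in row])
```

For this matrix the principal values come out as 3, 1 and 3, and their sum equals the sum of the original diagonal components.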


5.2.4. Gauss’s divergence theorem for a tensor field 


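When we say tensor field, we mean a physical quantity represented by 
a tensor $\hat{T}(x, y, z)$ whose nine components $T_{xx}(x, y, z), T_{xy}(x, y, z), \ldots, T_{zz}(x, y, z)$ 
are defined at every coordinate point (x, y, z). We assume that these 
nine components are all differentiable functions of the coordinates x, y, z. 
For such a tensor field, we define its divergence to be the formal dot 
product of the del operator $\nabla$ with the tensor $\hat{T}(x, y, z)$, it being assumed 
that $\nabla$ will appear on the left. 

Let us write the tensor $\hat{T}$ by the dyadic representation 

$$\hat{T} = \mathbf{T}^{(x)}\mathbf{e}_x + \mathbf{T}^{(y)}\mathbf{e}_y + \mathbf{T}^{(z)}\mathbf{e}_z, \tag{5.67}$$

as in Eq. (5.3). Then 

$$\begin{aligned}
\operatorname{div}\hat{T} \equiv \nabla\cdot\hat{T} &= \nabla\cdot(\mathbf{T}^{(x)}\mathbf{e}_x + \mathbf{T}^{(y)}\mathbf{e}_y + \mathbf{T}^{(z)}\mathbf{e}_z) \\
&\overset{\text{def}}{=} (\nabla\cdot\mathbf{T}^{(x)})\mathbf{e}_x + (\nabla\cdot\mathbf{T}^{(y)})\mathbf{e}_y + (\nabla\cdot\mathbf{T}^{(z)})\mathbf{e}_z.
\end{aligned} \tag{5.68}$$

Note that $\nabla\cdot\mathbf{T}^{(x)}$, $\nabla\cdot\mathbf{T}^{(y)}$, $\nabla\cdot\mathbf{T}^{(z)}$ are the familiar scalar divergences of the 
vector fields $\mathbf{T}^{(x)}, \mathbf{T}^{(y)}, \mathbf{T}^{(z)}$, respectively, 

$$\begin{aligned}
\nabla\cdot\mathbf{T}^{(x)} &= \frac{\partial T_{xx}}{\partial x} + \frac{\partial T_{yx}}{\partial y} + \frac{\partial T_{zx}}{\partial z}, \\
\nabla\cdot\mathbf{T}^{(y)} &= \frac{\partial T_{xy}}{\partial x} + \frac{\partial T_{yy}}{\partial y} + \frac{\partial T_{zy}}{\partial z}, \\
\nabla\cdot\mathbf{T}^{(z)} &= \frac{\partial T_{xz}}{\partial x} + \frac{\partial T_{yz}}{\partial y} + \frac{\partial T_{zz}}{\partial z},
\end{aligned} \tag{5.69}$$

and constitute three (scalar) components of the vector $\nabla\cdot\hat{T}$ along the X-, Y- 
and Z-axes, respectively. Combining (5.68) and (5.69), we get 

$$\nabla\cdot\hat{T} = \sum_{j=1}^{3}\left(\sum_{i=1}^{3}\frac{\partial T_{ij}}{\partial x_i}\right)\mathbf{e}_j = \frac{\partial T_{ij}}{\partial x_i}\,\mathbf{e}_j. \tag{5.70}$$

In the second equality, we have employed Einstein’s summation 
convention (introduced after Eq. (5.16)). 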

The divergence of a vector field is sometimes interpreted as “outflux per 
unit volume”. This association of divergence with outflux is due to Gauss’s 
divergence theorem briefly recalled in Sec. 11.2.2. Applying the divergence 
theorem, to the three vector fields T®, T”, TË separately, we get the 
following three equivalence relations: 


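$$\iiint_V \nabla\cdot\mathbf{T}^{(x)}(\mathbf{r})\,d^3r = \iint_S \mathbf{T}^{(x)}(\mathbf{r})\cdot\mathbf{n}(\mathbf{r})\,da, \qquad (5.71a)$$
$$\iiint_V \nabla\cdot\mathbf{T}^{(y)}(\mathbf{r})\,d^3r = \iint_S \mathbf{T}^{(y)}(\mathbf{r})\cdot\mathbf{n}(\mathbf{r})\,da, \qquad (5.71b)$$
$$\iiint_V \nabla\cdot\mathbf{T}^{(z)}(\mathbf{r})\,d^3r = \iint_S \mathbf{T}^{(z)}(\mathbf{r})\cdot\mathbf{n}(\mathbf{r})\,da. \qquad (5.71c)$$

Multiplying either side of Eqs. (5.71a)-(5.71c) with $\mathbf{e}_x$, $\mathbf{e}_y$, $\mathbf{e}_z$ respectively, and adding, we get

$$\iiint_V \nabla\cdot\left(\mathbf{T}^{(x)}\mathbf{e}_x + \mathbf{T}^{(y)}\mathbf{e}_y + \mathbf{T}^{(z)}\mathbf{e}_z\right)d^3r = \iint_S \left(\mathbf{T}^{(x)}\mathbf{e}_x + \mathbf{T}^{(y)}\mathbf{e}_y + \mathbf{T}^{(z)}\mathbf{e}_z\right)\cdot\mathbf{n}\,da. \qquad (5.72)$$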


Identifying the dyadic within the parentheses as the tensor T, we obtain the divergence theorem for the tensor field:

$$\iiint_V \nabla\cdot T\,d^3r = \iint_S T\cdot\mathbf{n}\,da. \qquad (5.73)$$


We shall find this theorem to be crucial for constructing Maxwell’s 
stress tensor in the next chapter. 

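Since the stress tensor T is symmetric, we can write its divergence as follows:

$$\nabla\cdot T = \frac{\partial T_{ji}}{\partial x_j}\,\mathbf{e}_i = \mathbf{e}_i\frac{\partial T_{ij}}{\partial x_j} = \mathbf{e}_i T_{ij,j}. \qquad (5.74)$$

In the last equality, we have adopted the convention $\phi_{,i} \equiv \partial\phi/\partial x_i$. The jth Cartesian component of the divergence theorem (5.73) can therefore be written as follows:

$$\iiint_V \frac{\partial T_{ij}}{\partial x_i}\,d^3r = \iint_S T_{ij}\,n_i\,da. \qquad (5.75)$$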


We shall find this practice useful while writing covariant equations in the 
context of the Special Theory of Relativity. 

Most authors prefer the expression (5.74) for the divergence. However, 
following our chain of discourse leading to the construction of the stress 
tensor, the expression (5.70) seems to be most natural. 


5.2.5. Volume force density in a stress tensor field 


Figure 5.4 shows an imaginary rectangular box abcdefgh of infinitesimal dimensions δx, δy, δz inside a medium under stress (which may be matter, or field). The centre P of this box is located at the coordinates (x, y, z). Let us assume that the stress in the medium is given by the tensor field T(x, y, z), whose components are differentiable functions of the coordinates. We shall find the total force on this box due to this stress.


Fig.5.4 Stress force on a volume element. 


We have shown in Fig. 5.4(a) the outward normal vectors ($\mathbf{e}_x$, $\mathbf{e}_y$, $\mathbf{e}_z$) on the three faces of the box that are exposed to our view. The outward normals on the other faces, which are hidden from our view, are ($-\mathbf{e}_x$, $-\mathbf{e}_y$, $-\mathbf{e}_z$). We shall identify each one of the six surfaces of the box by its outward normal vector.

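Let us consider the opposite faces abcd and efgh, recognized by the normals ($\mathbf{e}_x$) and ($-\mathbf{e}_x$). The locations of their centres are $(x + \frac{\delta x}{2}, y, z)$ and $(x - \frac{\delta x}{2}, y, z)$, respectively. The stress forces on these two faces are

$$\delta\mathbf{F}_{(\mathbf{e}_x)} = T\!\left(x + \tfrac{\delta x}{2}, y, z\right)\cdot(+\mathbf{e}_x)\,\delta y\,\delta z = \mathbf{T}^{(x)}\!\left(x + \tfrac{\delta x}{2}, y, z\right)\delta y\,\delta z = \left[\mathbf{T}^{(x)}(x, y, z) + \frac{\partial\mathbf{T}^{(x)}}{\partial x}\frac{\delta x}{2}\right]\delta y\,\delta z,$$

$$\delta\mathbf{F}_{(-\mathbf{e}_x)} = T\!\left(x - \tfrac{\delta x}{2}, y, z\right)\cdot(-\mathbf{e}_x)\,\delta y\,\delta z = -\left[\mathbf{T}^{(x)}(x, y, z) - \frac{\partial\mathbf{T}^{(x)}}{\partial x}\frac{\delta x}{2}\right]\delta y\,\delta z,$$

$$\delta\mathbf{F}_{(\mathbf{e}_x)} + \delta\mathbf{F}_{(-\mathbf{e}_x)} = \frac{\partial\mathbf{T}^{(x)}}{\partial x}\,\delta V,$$

where $\delta V = \delta x\,\delta y\,\delta z$ is the volume of the infinitesimal box. In the same way, we find the forces on the other four faces of the block. Adding the stress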
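forces on all the six surfaces, we get

$$\delta\mathbf{F}_s = \left(\frac{\partial\mathbf{T}^{(x)}}{\partial x} + \frac{\partial\mathbf{T}^{(y)}}{\partial y} + \frac{\partial\mathbf{T}^{(z)}}{\partial z}\right)\delta V$$

as the total stress force on the box. The volume force density $\mathbf{f}_s$, which gives the stress force acting per unit volume of the medium under stress, is then given as

$$\mathbf{f}_s = \frac{\delta\mathbf{F}_s}{\delta V} = \frac{\partial\mathbf{T}^{(x)}}{\partial x} + \frac{\partial\mathbf{T}^{(y)}}{\partial y} + \frac{\partial\mathbf{T}^{(z)}}{\partial z},$$

or

$$\mathbf{f}_s = \nabla\cdot T. \qquad (5.79)$$

Fig. 5.5 Stress forces on a bulk volume.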


One may conclude that the total stress force $\mathbf{F}_s$ on a bulk volume V carved out inside a medium, as shown in Fig. 5.5(a), is the volume integral of the force density $\mathbf{f}_s$ carried out over the entire volume V. We shall carefully analyze the forces inside the medium before jumping to this conclusion.

Let us consider a two-dimensional view of nine tiny, imaginary neighbouring blocks lying inside the medium and forming a group. We have marked the blocks as A, B, C, D, E, F, G, H, K, with A at the centre. In Fig. 5.4(b) we have shown the forces on the four sides of A as $\mathbf{F}_1$, $\mathbf{F}_2$, $\mathbf{F}_3$, $\mathbf{F}_4$. The force $\mathbf{F}_1$ comes from the neighbour B, and by Newton’s third law of motion, A applies an equal and opposite force $-\mathbf{F}_1$ on B. Similarly, the forces $\mathbf{F}_2$, $\mathbf{F}_3$, $\mathbf{F}_4$ come from the neighbours C, D, E, and A applies equal and opposite forces $-\mathbf{F}_2$, $-\mathbf{F}_3$, $-\mathbf{F}_4$ on them. It may then appear that these internal forces, when added together, get cancelled out and there should not be any stress force on the group at all.

A close examination will disprove this judgement. We have surrounded the group by an imaginary boundary surface Σ. It is now seen that even though the action-reaction forces cancel out in the interior of the group, they survive on the boundary surface Σ. These surface forces Fp1, Fp2, ..., Fp12, when added together, constitute the total force $\mathbf{F}_s$ on the group.

In Fig. 5.5(c), we have divided the volume V into an infinite number of infinitesimal blocks. The interior stress forces between adjoining blocks will cancel out. However, the forces on the boundary surface, some of which we have shown as Fp1, Fp2, Fp3, Fp4, will survive and add together to constitute the net stress force $\mathbf{F}_s$ on the volume V.

We now get a clue of how to find the net stress force $\mathbf{F}_s$ on the volume V. In Fig. 5.5(d), we have shown the volume V once again. At a certain point P on its surface we have pictured a tiny patch of area da, on which we have drawn a unit outward normal $\mathbf{n}$. The stress force on this patch is $d\mathbf{F}_s = \mathbf{T}^{(n)}\,da = T\cdot\mathbf{n}\,da$. Integrate this force over the entire boundary to get $\mathbf{F}_s$. We shall perform this integration and convert the surface integral into a volume integral by applying Gauss’s Divergence Theorem as derived in Eq. (5.73):

$$\mathbf{F}_s = \iint_S T\cdot\mathbf{n}\,da = \iiint_V \nabla\cdot T\,d^3r,$$

which reconfirms Eq. (5.79).
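The box argument of this section can be checked numerically. The sketch below is only an illustration: the polynomial stress field `stress` is hypothetical (it is not a field discussed in the text), and the function names are ours. It adds up the stress forces $T\cdot\mathbf{n}\,\delta a$ on the six faces of a small cube, divides by the volume, and compares the result with the analytic divergence of the same field.

```python
def stress(x, y, z):
    """Hypothetical symmetric stress field T[i][j], for illustration only."""
    return [[x * x, x * y,     y * z],
            [x * y, y * y * z, x * z],
            [y * z, x * z,     z ** 3]]

def divergence_exact(x, y, z):
    """Analytic components dT_ij/dx_j of the field defined above."""
    return [3 * x + y,            # d(x^2)/dx + d(xy)/dy + d(yz)/dz
            y + 2 * y * z + x,    # d(xy)/dx + d(y^2 z)/dy + d(xz)/dz
            3 * z * z]            # d(yz)/dx + d(xz)/dy + d(z^3)/dz

def force_density_from_box(x, y, z, d=1e-3):
    """Sum T.n * area over the six faces of a cube of side d centred at
    (x, y, z), then divide by the volume of the cube."""
    centre = (x, y, z)
    force = [0.0, 0.0, 0.0]
    for j in range(3):                  # pair of faces with normals +e_j, -e_j
        for sign in (+1, -1):
            p = list(centre)
            p[j] += sign * d / 2        # centre of this face
            T = stress(*p)
            for i in range(3):
                # stress vector on this face is T.(sign * e_j), i.e. sign * T[i][j]
                force[i] += sign * T[i][j] * d * d
    return [f / d ** 3 for f in force]
```

Evaluating both functions at the same point, for example `force_density_from_box(0.3, -0.7, 1.2)` and `divergence_exact(0.3, -0.7, 1.2)`, should give components that agree closely when the box is small.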
The simplest example of stress field in matter is provided by a perfect 
fluid, which by definition, does not support shear stress. Since tensile stress 


is also ruled out in a fluid, the stress field inside a perfect fluid is left with 
only normal compressive stress, which is known more familiarly by the 
name pressure, to be written as p(x, y, z). It is then obvious that the stress 
field in a perfect fluid is the “pressure tensor”, having only three equal 
diagonal elements p(x, y, z):


According to Eq. (5.79) the volume force density inside a pressure field is 
given as 


and the stress vector on a surface n as 


ᵃ As stressed by Misner et al. [5, Sec. 2.2].

ᵇ See a good book on intermediate mechanics, e.g. Ref. [18, Chapter 10], which also discusses the Inertia Tensor in detail.

ᶜ In Quantum Mechanics (QM), a clear distinction is made between a vector A on the left and a vector B on the right, as in the scalar product A · B. The former is called a bra vector and the latter a ket vector, and together, in the scalar product, they constitute a bra-ket: A → ⟨A|; B → |B⟩; A · B → ⟨A|B⟩. However, these vectors are in general infinite dimensional, their components are complex numbers, and the components of the bra vector ⟨A| are complex conjugates of the respective components of the ket vector |A⟩.

ᵈ In QM, the completeness of a set of orthonormal vectors {|u_i⟩; i = 1, 2, ..., ∞} is expressed through the statement Σ_i |u_i⟩⟨u_i| = 1. This relation is used to change the representation of a Hermitean operator T, the equivalent of the tensors we are considering here. The (i, j) component of T will then be written as T_ij = ⟨u_i|T|u_j⟩, the equivalent of Eq. (5.20b).

ᵉ In Tensor analysis, the primary language of the theory of relativity, the rule of transformation has different forms for contravariant and covariant vectors, and for contravariant, covariant and mixed tensors. The rules we are establishing here are different from them. The components of the vectors and tensors we are using may be called physical components, in contrast to their contravariant and covariant components, for which a more elegant transformation rule is used.

ᶠ See [18, Sec. 10-4].


Chapter 6 


Maxwell’s Stress Tensor 


6.1. Introduction 


‘Action at a distance’ (AAD) was an enigma to natural philosophers, from René Descartesᵃ (1596-1650) to James Clerk Maxwell (1831-1879). We find an account of the evolution of physical concepts in [19]. According to Descartes, space was a plenum, a medium called aether, capable of transmitting force on material bodies. “It was to be regarded as the solitary tenant of the universe, save for that infinitesimal fraction of space which is occupied by ordinary matter.”

Subsequent theoretical physicists and mathematicians, Robert Hooke (1635-1703), Isaac Newton (1642-1727), Riemann (1826-1866), W. Thomson (1824-1907), Maxwell and others lent their support to this view.
Implicit in their belief was the assumption that force cannot be transmitted 
except by actual pressure or impact. AAD was a taboo, as abhorrent as 
witchcraft: I wave my hand here and a fire is ignited there. In order to 
support their faith in aether they contrived every possible idea, any possible 
mechanical model, to make aether viable. 

According to Newton “All space is pervaded by an elastic medium or 
aether, which is capable of propagating vibrations in the same way as air 
propagates the vibrations of sound. This aether pervades the pores of all 
material bodies, and is the cause of their cohesion; its density varies from 
one body to another, being greatest in the interplanetary space.” 

Maxwell inherited this legacy. We shall quote a few passages from his 
celebrated paper A Dynamical Theory of the Electromagnetic Field read to 
the Royal Society of London on December 8, 1864 [20]. 


“(1) In this way mathematical theories of statical electricity, of magnetism, of the 
mechanical action between conductors carrying currents, and of the induction of
currents have been formed. In these theories the force acting between two bodies is
treated with reference only to the condition of the bodies and their relative position, 
and without reference to the surrounding medium.” 


“(2) The mechanical difficulties, however, which are involved in the assumption of 
particles acting at a distance with forces which depend on their velocities are such as 
to prevent me from considering this theory as an ultimate one, though it may have 
been, and may yet be useful to the coordination of phenomena.” 


“(3) The theory I propose may therefore be called a theory of the Electromagnetic 
Field, because it has to do with the space in the neighbourhood of the electric and 
magnetic bodies, and it may be called a Dynamical Theory, because it assumes that in
that space there is matter in motion, by which the observed electromagnetic 
phenomena are produced.” 


“(4) The electromagnetic field is that part of space which contains and surrounds bodies in electric and magnetic conditions. ... It may contain any kind of matter, or we may render it empty of all gross matter, as in the case of Geissler’s Tubes and other so-called vacua.

There is always, however, enough matter to receive and transmit the undulations of light and heat, and it is because the transmission of these radiations is not greatly altered when transparent bodies of measurable densities are substituted for the so-called vacuum, that we are obliged to admit that the undulations are those of aetherial substance, and not of the gross matter, the presence of which merely modifies in some way the motion of the aether.


We have therefore some reason to believe, from the phenomena of light and heat, that 
there is an aetherial medium filling space and permeating bodies, capable of being set 
in motion and of transmitting that motion from one part to another, and 
communicating that motion to gross matter so as to heat it and affect it in various 
ways.” 


One aspect of the mechanical model Maxwell built up to present a
complete picture of the electromagnetic field was the proposition that space, 
i.e. aether, can sustain stress, and a force is transmitted from one body 
(electrified or magnetized) to another by means of stress, in the same way a 
force is transmitted from one end of a cable to the other by means of tensile 
stress, and from one part of a beam to another by means of shear stress. 

In his two-volume book A Treatise on Electricity and Magnetism Maxwell presents a complete formulation of the stress in the field (read aether) by constructing the Stress Tensor for the Static Electric Field [21] and for the Static Magnetic Field, in terms of the field potentials.

We have derived the stress tensors for the electrostatic field, the magnetostatic field and the time-varying electromagnetic field in terms of the electric field E and the magnetic field B in a unified manner, exploiting the useful identity given in Eq. (6.7).

Einstein’s formulation of the Special Theory of Relativity saw the 
demise of the Luminiferous (i.e. light carrying) Aether. Light travels in 
empty space, electric and magnetic forces also propagate from one body to 
another (with the speed of light) in empty space. Is there then any place for 
Maxwell’s stress tensor? Is it only for historical reason that we are writing 
this long article? We shall attempt to provide the answer in four steps. 

First, it is indeed an amazing thing that the force acting on an isolated 
body A (which may consist of electric charges and currents), due to the 
presence of charges and currents elsewhere, can be computed exactly by 
drawing a boundary surface s of our convenience surrounding A, as in Fig. 
6.1(a), finding the “stress” all over this surface, and by integrating this 
stress. In other words, there is stress even in vacuum. The purpose of this 
chapter is to articulate how this stress is to be found out. Also it should be 
noted with interest that even empty space is not a true vacuum. When 
loaded with the electric and magnetic fields, space comes under stress. 
Empty space is always buzzing with emission and absorption of virtual 
particles, with the virtual photons mediating the interaction among 
electrified and magnetized objects. Aren’t these virtual photons the new 
avatar of the aether? 

Secondly, calculating the force on an isolated object A requires exact 
knowledge of the E or B field in which A is immersed. In recognizing these 
fields, one has to be very careful that these E, B fields do not contain any 
trace of the fields contributed by A itself. This is sometimes a challenging 
task. Consider for example the force acting on the surface of a conductor carrying a surface charge density σ, as in Fig. 6.1(b). The electric field just outside the surface is $\mathbf{E} = (\sigma/\varepsilon_0)\mathbf{n}$ where $\mathbf{n}$ is a unit normal to the surface. One may be tempted to conclude that the force per unit area of the surface is $\mathbf{F}' = \sigma\mathbf{E} = (\sigma^2/\varepsilon_0)\mathbf{n}$, forgetting the fact that an infinitesimal area da on the surface contributes the same E field perpendicular to the surface as the rest of the surface, so that the true force is


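$$\mathbf{F} = \tfrac{1}{2}\mathbf{F}' = \left(\sigma^2/2\varepsilon_0\right)\mathbf{n} = \left(\varepsilon_0 E^2/2\right)\mathbf{n}. \qquad (6.1)$$

Fig. 6.1. Electrified object in E field (panels (a) and (b)).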


The stress tensor approach, which uses the total field $\mathbf{E}_{\text{total}}$, making no distinction between the test object and the source object, will give the right result without creating any confusion, as we shall show following Eq. (6.15).

Thirdly, it is always advisable to arrive at the same answer through 
several alternative routes, if available, just to make sure that we have not 
made any mistakes. The stress tensor provides that valuable alternative 
route. 

And fourthly, Maxwell’s stress tensor, which we shall denote by the
symbol T, is needed for understanding conservation and flow of momentum
in the electromagnetic field, which we shall present in Sec. 6.5. When one
goes deeper into the theory of relativity the same tensor appears as the most 
important component of the energy-momentum tensor required not only for 
presenting a four dimensional and unified view of the conservation of 
energy and momentum, but also for building up the source term in 
formulating Einstein’s field equation for the gravitational field, in his 
General Theory of Relativity. 


6.2. Maxwell’s Stress Tensor for the Electrostatic Field 


6.2.1. Volume force density in terms of the field 


We shall now construct the stress tensor for the electrostatic field. We shall call this tensor Maxwell’s Stress Tensor and represent it by the symbol $T^{(e)}$, where the superscript (e) implies electric field.

Figure 6.2 shows a system of electric charges S placed in an electric field $\mathbf{E}(\mathbf{r})$. In Fig. 6.2(a), the system consists of discrete charges $q_1, q_2, q_3, \ldots$ placed at the radius vectors $\mathbf{r}_1, \mathbf{r}_2, \mathbf{r}_3, \ldots$. In Fig. 6.2(b), the system is a continuous distribution characterized by a smooth charge density function $\rho(\mathbf{r})$ confined within a volume. Our intention is to write the total electric force $\mathbf{F}$ on this system.

The force on the discrete system shown in Fig. 6.2(a) is given as follows:

$$\mathbf{F} = \sum_j q_j\,\mathbf{E}^{\text{ext}}(\mathbf{r}_j). \qquad (6.2)$$

Here the sum is over all the charges in the system, and $\mathbf{E}^{\text{ext}}(\mathbf{r}_j)$ is the external electric field at the radius vector $\mathbf{r}_j$ caused by the presence of all other charges lying outside the system S.

For the case of continuous distribution, shown in Fig. 6.2(b), the individual charges become infinitesimal elementary charges, i.e. $q_j \to \rho(\mathbf{r})\,d^3r$, and the sum becomes the integral

$$\mathbf{F} = \iiint_V \rho(\mathbf{r})\,\mathbf{E}^{\text{ext}}(\mathbf{r})\,d^3r. \qquad (6.3)$$

Fig. 6.2. Forces on charges in an electric field.


What about the force from the charges inside the system S? They are internal forces, and cancel due to Newton’s third law of motion.

Let $\mathbf{E}_i^{\text{int}}(\mathbf{r}_j)$ be the “internal” field caused at $\mathbf{r}_j$ by a member particle i lying within the system S. Then $\mathbf{F}_{ij} = q_j\mathbf{E}_i^{\text{int}}(\mathbf{r}_j)$ is the force that the member particle i exerts on the member particle j. By Newton’s third law of motion, $q_j\mathbf{E}_i^{\text{int}}(\mathbf{r}_j) + q_i\mathbf{E}_j^{\text{int}}(\mathbf{r}_i) = 0$. Adding together over all pairs for the discrete distribution, and integrating over the entire distribution for the continuous distribution, we get

$$\begin{aligned}
\text{For discrete:}\quad & \sum_{j=1}^{N} q_j \sideset{}{'}\sum_{i=1}^{N}\mathbf{E}_i^{\text{int}}(\mathbf{r}_j) = \sum_{j=1}^{N} q_j\,\mathbf{E}^{\text{int}}(\mathbf{r}_j) = 0, & (a)\\
\text{For continuous:}\quad & \iiint_V \rho(\mathbf{r})\,\mathbf{E}^{\text{int}}(\mathbf{r})\,d^3r = 0. & (b)
\end{aligned} \qquad (6.4)$$

In the first equation, the sum symbol $\sum'$ means that while summing over i, the term i = j (corresponding to the “self field” of the member j) is to be avoided. The “internal field” $\mathbf{E}^{\text{int}}(\mathbf{r}_j)$ is the field at the location of the member j caused by “all other members” in the system S. In the second equation $\mathbf{E}^{\text{int}}(\mathbf{r})$ is the “internal field” at the radius vector $\mathbf{r}$, as sensed by a tiny volume element $d^3r$ at this point.

We shall add the null contribution shown in the second line of Eq. (6.4) to the right-hand side of Eq. (6.3) and write

$$\mathbf{F} = \iiint_V \rho(\mathbf{r})\,\mathbf{E}(\mathbf{r})\,d^3r. \qquad (6.5)$$

Here $\mathbf{E}(\mathbf{r})$ is the actual field at the point $\mathbf{r}$, being the sum of two contributions, from (i) the external sources, and (ii) the internal sources of the system S.

The purpose of adding the null integral of Eq. (6.4b) to Eq. (6.3) is that when we write the force density $\mathbf{f}$, the internal forces need to be added. That is,

$$\mathbf{f}(\mathbf{r}) = \rho(\mathbf{r})\,\mathbf{E}(\mathbf{r}) \qquad (6.6)$$

is the force on unit volume of the charge distribution at $\mathbf{r}$, in which $\mathbf{E}(\mathbf{r})$ is necessarily the total field at this location, caused by both external and internal sources. Now we manipulate the right-hand side of Eq. (6.6) so as to convert $\rho\mathbf{E} \to \nabla\cdot T^{(e)}$, as suggested in Eq. (5.79). This new tensor field $T^{(e)}(\mathbf{r})$ would represent “stress” in the electrostatic field.

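Construction of the stress tensor for the electrostatic field, the magnetostatic field and the time-varying electromagnetic field will be facilitated by the following identity [22]:

$$\nabla\cdot\left(\mathbf{A}\mathbf{A} - \tfrac{1}{2}A^2\,\mathbf{1}\right) = (\nabla\cdot\mathbf{A})\,\mathbf{A} - \mathbf{A}\times(\nabla\times\mathbf{A}). \qquad (6.7)$$

Before establishing the above identity we shall need a standard formula (see, for example, vector identities compiled in Griffiths, 4th edn):

$$\nabla(\mathbf{A}\cdot\mathbf{B}) = \mathbf{A}\times(\nabla\times\mathbf{B}) + \mathbf{B}\times(\nabla\times\mathbf{A}) + (\mathbf{A}\cdot\nabla)\mathbf{B} + (\mathbf{B}\cdot\nabla)\mathbf{A}. \qquad (6.8)$$

By setting $\mathbf{B} = \mathbf{A}$ in the above formula, we get

$$\nabla\left(\tfrac{1}{2}A^2\right) = \mathbf{A}\times(\nabla\times\mathbf{A}) + (\mathbf{A}\cdot\nabla)\mathbf{A}. \qquad (6.9)$$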


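We shall now prove the identity (6.7).

Proof.

$$\begin{aligned}
\nabla\cdot(\mathbf{A}\mathbf{A}) &= \left(\mathbf{e}_k\frac{\partial}{\partial x_k}\right)\cdot\left(\mathbf{e}_i\mathbf{e}_j A_i A_j\right)\\
&= \frac{\partial}{\partial x_i}\left(A_i A_j\right)\mathbf{e}_j\\
&= \left(\frac{\partial A_i}{\partial x_i}A_j + A_i\frac{\partial A_j}{\partial x_i}\right)\mathbf{e}_j\\
&= (\nabla\cdot\mathbf{A})\,\mathbf{A} + (\mathbf{A}\cdot\nabla)\mathbf{A}, & (a)\\
\nabla\cdot\left(\tfrac{1}{2}A^2\,\mathbf{1}\right) &= \nabla\left(\tfrac{1}{2}A^2\right)
= \mathbf{A}\times(\nabla\times\mathbf{A}) + (\mathbf{A}\cdot\nabla)\mathbf{A}, \quad\text{by (6.9)}. & (b)
\end{aligned}$$

The identity (6.7) follows when we subtract line (b) from line (a). ∎

Note that we have used Einstein’s summation convention introduced on p. 149. That is, $\mathbf{e}_i\mathbf{e}_j A_i A_j = \sum_{i=1}^{3}\sum_{j=1}^{3}\mathbf{e}_i\mathbf{e}_j A_i A_j$, etc.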
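The identity (6.7) can also be spot-checked numerically. The following is a minimal sketch under stated assumptions: the vector field `A` is an arbitrary polynomial chosen by us for illustration, and all derivatives are approximated by central differences, so the two sides are expected to agree only up to a small numerical error.

```python
def A(x, y, z):
    """Hypothetical smooth vector field, used only for this check."""
    return [y * z, x * x, x * y * z]

def partial(f, p, k, h=1e-5):
    """Central-difference derivative of a list-valued function f along axis k."""
    up = list(p)
    dn = list(p)
    up[k] += h
    dn[k] -= h
    return [(a - b) / (2 * h) for a, b in zip(f(*up), f(*dn))]

def S_flat(x, y, z):
    """Components of AA - (1/2) A^2 1, flattened row by row (index 3*i + j)."""
    a = A(x, y, z)
    a2 = sum(c * c for c in a)
    return [a[i] * a[j] - (0.5 * a2 if i == j else 0.0)
            for i in range(3) for j in range(3)]

def lhs(p):
    """Left side of (6.7): (div S)_j = dS_ij/dx_i, with del acting from the left."""
    dS = [partial(S_flat, p, i) for i in range(3)]   # dS[i][3*m + j] = dS_mj/dx_i
    return [sum(dS[i][3 * i + j] for i in range(3)) for j in range(3)]

def rhs(p):
    """Right side of (6.7): (div A) A - A x (curl A)."""
    a = A(*p)
    dA = [partial(A, p, k) for k in range(3)]        # dA[k][i] = dA_i/dx_k
    div = sum(dA[i][i] for i in range(3))
    curl = [dA[1][2] - dA[2][1], dA[2][0] - dA[0][2], dA[0][1] - dA[1][0]]
    cross = [a[1] * curl[2] - a[2] * curl[1],
             a[2] * curl[0] - a[0] * curl[2],
             a[0] * curl[1] - a[1] * curl[0]]
    return [div * a[j] - cross[j] for j in range(3)]
```

Calling `lhs(p)` and `rhs(p)` at any sample point, such as `p = (0.4, -1.1, 0.7)`, should return nearly equal triples. Agreement at a few points is of course no substitute for the proof above, but it is a quick guard against sign errors.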


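The stress tensor for the electrostatic field follows when we set $\mathbf{E}$ for $\mathbf{A}$ in (6.7), and use the field equations $\nabla\cdot\mathbf{E} = \rho/\varepsilon_0$; $\nabla\times\mathbf{E} = 0$:

$$\begin{aligned}
\mathbf{f}^{(e)} &= \rho\mathbf{E} = \nabla\cdot T^{(e)}, & (a)\\
\text{where}\quad T^{(e)} &= \varepsilon_0\left(\mathbf{E}\mathbf{E} - \tfrac{1}{2}E^2\,\mathbf{1}\right). & (b)
\end{aligned} \qquad (6.10)$$

It will be a simple exercise to write the Cartesian components of this tensor:

$$T^{(e)} = \left(T^{(e)}\cdot\mathbf{e}_x \;\; T^{(e)}\cdot\mathbf{e}_y \;\; T^{(e)}\cdot\mathbf{e}_z\right) = \varepsilon_0\begin{pmatrix}
\tfrac{1}{2}\left(E_x^2 - E_y^2 - E_z^2\right) & E_xE_y & E_xE_z\\
E_yE_x & \tfrac{1}{2}\left(E_y^2 - E_z^2 - E_x^2\right) & E_yE_z\\
E_zE_x & E_zE_y & \tfrac{1}{2}\left(E_z^2 - E_x^2 - E_y^2\right)
\end{pmatrix}. \qquad (6.11)$$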


6.2.2. Example 1: Stress vector on a plane as a function of the 
angle of inclination 


The stress tensor (6.10) will remain abstract and obscure unless the reader 
works out a few examples. We shall provide two examples of which the 
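first one is depicted in Fig. 6.3. A uniform electric field $\mathbf{E} = E\,\mathbf{e}_x$ exists in a certain region of space. The stress tensor is then given by the following expression:

$$T^{(e)} = \frac{\varepsilon_0}{2}E^2\left(\mathbf{e}_x\mathbf{e}_x - \mathbf{e}_y\mathbf{e}_y - \mathbf{e}_z\mathbf{e}_z\right) = \frac{\varepsilon_0}{2}\begin{pmatrix} E^2 & 0 & 0\\ 0 & -E^2 & 0\\ 0 & 0 & -E^2 \end{pmatrix}. \qquad (6.12)$$

Imagine a plane running parallel to the Z-axis, but inclined to the X-axis by an angle θ (Fig. 6.3(a)). The normal vector is then given as

$$\mathbf{n} = \mathbf{e}_x\sin\theta + \mathbf{e}_y\cos\theta = \begin{pmatrix}\sin\theta\\ \cos\theta\\ 0\end{pmatrix}. \qquad (6.13)$$

The stress vector $\mathbf{T}^{(n)}$ on this plane is then

$$\mathbf{T}^{(n)} = T^{(e)}\cdot\mathbf{n} = \frac{\varepsilon_0}{2}E^2\left(\mathbf{e}_x\sin\theta - \mathbf{e}_y\cos\theta\right) = \frac{\varepsilon_0}{2}E^2\begin{pmatrix}\sin\theta\\ -\cos\theta\\ 0\end{pmatrix}. \qquad (6.14)$$

Fig. 6.3. Stress vector on an inclined plane placed in a uniform electric field.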


Let us consider some special cases:

$$\mathbf{T}^{(x)} = \frac{\varepsilon_0}{2}E^2\,\mathbf{e}_x \quad (\text{by setting } \theta = \pi/2), \qquad (6.15a)$$
$$\mathbf{T}^{(y)} = -\frac{\varepsilon_0}{2}E^2\,\mathbf{e}_y \quad (\text{by setting } \theta = 0), \qquad (6.15b)$$
$$\mathbf{T}^{(z)} = -\frac{\varepsilon_0}{2}E^2\,\mathbf{e}_z \quad (\text{same as } T^{(e)}\cdot\mathbf{e}_z), \qquad (6.15c)$$
$$\mathbf{T}^{(45^\circ)} = \frac{\varepsilon_0}{2}E^2\,\frac{1}{\sqrt{2}}\left(\mathbf{e}_x - \mathbf{e}_y\right). \qquad (6.15d)$$

Equations (6.15a)-(6.15c) give the stress vectors on the planes identified by the normal vectors $\mathbf{e}_x$, $\mathbf{e}_y$, $\mathbf{e}_z$, and Eq. (6.15d) gives the stress vector on a plane making an angle of 45° with the X-axis. We have illustrated these points in Figs. 6.3(b) and 6.3(c). We have shown the stress vectors with thick arrows, and labelled them with the bold letter T. We draw the following conclusions.


Conclusions: 


(a) If the field is perpendicular to the plane, the stress vector is normal and outward (tensile stress), and equal to $\varepsilon_0E^2/2$ in magnitude.

(b) If the field is tangential to the plane, the stress vector is normal and inward (compressive stress), and equal to $\varepsilon_0E^2/2$ in magnitude.

(c) If the field makes angle 45° to the plane, the stress vector is tangential (shear stress), and equal to $\varepsilon_0E^2/2$ in magnitude.


Case (a) applies to a conductor in an electric field E. The field is 
perpendicular to the surface. The surface force density is the same as the 
stress vector. We get back the same answer as in Eq. (6.1) using the stress 
tensor, without laboring to find out what is the “external field”. 
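The dependence of the stress vector on the inclination of the plane, Eq. (6.14), is easy to tabulate. The sketch below is an illustration only; the function names and the value used for the field strength are our own choices. It splits the stress vector into a component along the normal (positive for tensile, negative for compressive) and a tangential (shear) magnitude.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m

def stress_vector(E, theta):
    """Stress vector T.n for a uniform field E e_x on the plane whose normal is
    n = (sin(theta), cos(theta), 0), using T = (eps0 E^2 / 2) diag(1, -1, -1)."""
    s = 0.5 * EPS0 * E * E
    n = (math.sin(theta), math.cos(theta), 0.0)
    diag = (s, -s, -s)
    return tuple(d * c for d, c in zip(diag, n)), n

def normal_and_shear(E, theta):
    """Return (component of the stress vector along n, tangential magnitude)."""
    t, n = stress_vector(E, theta)
    normal = sum(a * b for a, b in zip(t, n))
    # guard against a tiny negative value caused by rounding
    shear = math.sqrt(max(sum(a * a for a in t) - normal ** 2, 0.0))
    return normal, shear
```

For instance, `normal_and_shear(1.0, math.pi / 2)`, `normal_and_shear(1.0, 0.0)` and `normal_and_shear(1.0, math.pi / 4)` correspond to the cases (a), (b) and (c) listed above. In general the normal component is $-(\varepsilon_0E^2/2)\cos 2\theta$, which changes sign at θ = 45°.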


6.2.3. Example 2: Force transmitted between two charged 
particles across a spherical boundary 


We shall obtain the familiar Coulomb force between two charged particles 
using Maxwell’s stress tensor. 


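We shall use the spherical coordinate system. The reader must have used the spherical coordinate system to construct vectors in situations of spherical symmetry, as in the case of central forces in mechanics. This example, and two more in the next sections, will give an opportunity to deploy the same coordinate system to construct a tensor, the stress tensor to be specific, to get at the answer with less expenditure of time.

We shall first obtain an expression for the E field at any arbitrary point P(r, θ, φ) located on the spherical surface Σ. The point P is at the displacement vector $\mathbf{n}$ from A and $\mathbf{r}$ from O (Fig. 6.4(a)). In order to avoid repeated appearance of the constant $\frac{1}{4\pi\varepsilon_0}$ we shall set $\mathbf{E} = \frac{1}{4\pi\varepsilon_0}\boldsymbol{\mathcal{E}}$. Note that

$$\mathbf{n} = \mathbf{r} - \mathbf{a} = \mathbf{r} - a\,\mathbf{e}_z, \qquad (6.16a)$$
$$\text{so that}\quad n^2 = r^2 + a^2 - 2ra\cos\theta, \qquad (6.16b)$$
$$\text{and}\quad \mathbf{e}_z = \cos\theta\,\mathbf{e}_r - \sin\theta\,\mathbf{e}_\theta. \qquad (6.16c)$$

Then

$$\boldsymbol{\mathcal{E}} = \frac{Q\,\mathbf{r}}{r^3} + \frac{q\,\mathbf{n}}{n^3} \qquad (6.17a)$$
$$\phantom{\boldsymbol{\mathcal{E}}} = \frac{Q\,\mathbf{e}_r}{r^2} + \frac{q\,(\mathbf{r} - a\,\mathbf{e}_z)}{(r^2 + a^2 - 2ra\cos\theta)^{3/2}}. \qquad (6.17b)$$

Therefore,

$$\boldsymbol{\mathcal{E}} = \mathcal{E}_r\,\mathbf{e}_r + \mathcal{E}_\theta\,\mathbf{e}_\theta, \qquad (6.18a)$$
$$\text{where}\quad \mathcal{E}_r = \frac{Q}{r^2} + \frac{q\,(r - a\cos\theta)}{(r^2 + a^2 - 2ra\cos\theta)^{3/2}}, \qquad (6.18b)$$
$$\mathcal{E}_\theta = \frac{q\,a\sin\theta}{(r^2 + a^2 - 2ra\cos\theta)^{3/2}}. \qquad (6.18c)$$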

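From Eq. (6.10), the stress tensor is

$$T^{(e)} = \varepsilon_0\left(\mathbf{E}\mathbf{E} - \tfrac{1}{2}E^2\,\mathbf{1}\right) = \frac{1}{16\pi^2\varepsilon_0}\left(\boldsymbol{\mathcal{E}}\boldsymbol{\mathcal{E}} - \tfrac{1}{2}\mathcal{E}^2\,\mathbf{1}\right) = \frac{1}{16\pi^2\varepsilon_0}\,\tilde{T}^{(e)}, \quad\text{where}\quad \tilde{T}^{(e)} = \boldsymbol{\mathcal{E}}\boldsymbol{\mathcal{E}} - \tfrac{1}{2}\mathcal{E}^2\,\mathbf{1}, \qquad (6.19)$$

which we may refer to as the “reduced stress tensor”.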


Fig. 6.4 Electric stress vector on a spherical surface. 


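Since we have invoked the spherical coordinate system to write the expression for the $\boldsymbol{\mathcal{E}}$ field, the components of the reduced tensor will have to be written in this coordinate system. Since only the r and θ components of $\boldsymbol{\mathcal{E}}$ are non-zero, the only non-zero off-diagonal components of this tensor are $T_{r\theta}$ and $T_{\theta r}$, as seen from (6.19). Moreover $\mathcal{E}^2 = \mathcal{E}_r^2 + \mathcal{E}_\theta^2$, and we write this tensor as

$$\tilde{T}^{(e)} = \begin{pmatrix} T_{rr} & T_{r\theta} & 0\\ T_{\theta r} & T_{\theta\theta} & 0\\ 0 & 0 & T_{\phi\phi}\end{pmatrix}, \quad\text{where}\quad T_{rr} = -T_{\theta\theta} = \tfrac{1}{2}\left(\mathcal{E}_r^2 - \mathcal{E}_\theta^2\right),\quad T_{r\theta} = T_{\theta r} = \mathcal{E}_r\mathcal{E}_\theta,\quad T_{\phi\phi} = -\tfrac{1}{2}\left(\mathcal{E}_r^2 + \mathcal{E}_\theta^2\right). \qquad (6.20)$$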

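The first column in the square matrix on the left represents the stress vector $\mathbf{T}_r$ on the spherical surface Σ (corresponding to $\mathbf{n} = \mathbf{e}_r$, analogous to the first column in Eq. (5.62)). Using the expressions for $\mathcal{E}_r$, $\mathcal{E}_\theta$ given in (6.18) we shall work out the components of $\mathbf{T}_r$ explicitly as follows:

$$\begin{aligned}
\mathbf{T}_r &= \mathbf{e}_r T_{rr} + \mathbf{e}_\theta T_{\theta r},\\
T_{rr} &= \mathcal{E}_r^2 - \tfrac{1}{2}\mathcal{E}^2 = \tfrac{1}{2}\left(\mathcal{E}_r^2 - \mathcal{E}_\theta^2\right)\\
&= \frac{1}{2}\left\{\frac{Q^2}{r^4} + \frac{q^2\left[(r - a\cos\theta)^2 - (a\sin\theta)^2\right]}{(r^2 + a^2 - 2ra\cos\theta)^3} + \frac{2Qq\,(r - a\cos\theta)}{r^2\,(r^2 + a^2 - 2ra\cos\theta)^{3/2}}\right\},\\
T_{\theta r} &= \mathcal{E}_r\mathcal{E}_\theta = \frac{Qq\,a\sin\theta}{r^2\,(r^2 + a^2 - 2ra\cos\theta)^{3/2}} + \frac{q^2\,a\sin\theta\,(r - a\cos\theta)}{(r^2 + a^2 - 2ra\cos\theta)^3}.
\end{aligned} \qquad (6.21)$$

The first component $T_{rr}$ is the normal stress on the surface Σ and the second one $T_{\theta r}$, the tangential (or, the shear) stress.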

In order to illustrate the above equations, and to see how the electric field vector $\boldsymbol{\mathcal{E}}$ and the Maxwell stress vector $\mathbf{T}_r$ vary on the surface of the imaginary sphere Σ, we shall make a numerical example, setting Q = 2, q = -1, a = 3, r = 1 in Eqs. (6.18) and (6.21). The expressions we now get are functions of the polar angle θ only. We have plotted $T_{rr}$, $T_{\theta r}$ in Fig. 6.4(b), using Maxima.

In order to show how the field vector $\boldsymbol{\mathcal{E}}$ and the stress vector $\mathbf{T}_r$ vary on the surface of the sphere Σ we have prepared Table 3.1 after evaluating the corresponding quantities in the columns 1-9, using Maxima. The angles $\phi_E$, $\phi_T$ appearing in columns 5 and 9 have been explained in Fig. 6.4(c).


Table 3.1. ℰ and T_r vectors on the surface of the sphere.

  1       2       3       4       5        6       7       8       9
  θ       ℰ_r     ℰ_θ     ℰ       φ_E      T_rr    T_θr    T_r     φ_T
  0°      2.25    0       2.25    0°       2.53    0       2.53    0°
  30°     2.15    −0.14   2.16    −3.8°    2.30    −0.31   2.33    −7.6°
  60°     2.03    −0.14   2.03    −4°      2.04    −0.28   2.06    −7.9°
  90°     1.97    −0.10   1.97    −2.8°    1.93    −0.19   1.94    −5.5°
  120°    1.95    −0.05   1.95    −1.6°    1.89    −0.11   1.90    −3.3°
  150°    1.94    −0.02   1.94    −0.8°    1.88    −0.05   1.88    −1.5°
  180°    1.94    0       1.94    0°       1.88    0       1.88    0°

The first one is the angle between the normal $\mathbf{e}_r$ to the surface Σ and the electric field $\boldsymbol{\mathcal{E}}$ at the surface, and the second one is the angle between $\mathbf{e}_r$ and the stress vector $\mathbf{T}_r$ on the surface:

$$\mathcal{E} = \sqrt{\mathcal{E}_r^2 + \mathcal{E}_\theta^2}, \quad \tan\phi_E = \frac{\mathcal{E}_\theta}{\mathcal{E}_r}; \qquad T_r = \sqrt{T_{rr}^2 + T_{\theta r}^2}, \quad \tan\phi_T = \frac{T_{\theta r}}{T_{rr}}. \qquad (6.22)$$


We have drawn the field vectors $\boldsymbol{\mathcal{E}}$ and the stress vectors $\mathbf{T}_r$ on the sphere Σ in Figs. 6.4(d) and 6.4(e) (using two different scales for the two sets of vectors).

All this tedious work will have been fruitful if we can show that the surface force density, when integrated over the entire surface Σ, gives us back the familiar Coulomb force between the two charges. The surface force density is the same as the stress vector on this surface. We shall work with the “reduced” surface force density, same as $\mathbf{T}_r$.


The Coulomb force of attraction (if Q, q are of opposite signs) or repulsion (if they are of the same sign) will be along the line OA joining the two charges. Since this line coincides with the Z-axis, we shall integrate the Z component of $\mathbf{T}_r$, which we shall denote as $f_z$. We go back to Eqs. (6.16) and (6.21) to compute this force, and get the following results after some simplification:

$$\begin{aligned}
f_z &= \mathbf{e}_z\cdot\mathbf{T}_r\\
&= (\cos\theta\,\mathbf{e}_r - \sin\theta\,\mathbf{e}_\theta)\cdot(\mathbf{e}_r T_{rr} + \mathbf{e}_\theta T_{\theta r}) & (6.23a)\\
&= \cos\theta\,T_{rr} - \sin\theta\,T_{\theta r} & (6.23b)\\
&= f_z(Q^2) + f_z(Qq) + f_z(q^2), \quad\text{where} & (6.23c)\\
f_z(Q^2) &= \frac{1}{2}\frac{Q^2}{r^4}\cos\theta, & (6.23d)\\
f_z(Qq) &= \frac{Qq\,[r\cos\theta - a]}{r^2\,(r^2 + a^2 - 2ra\cos\theta)^{3/2}}, & (6.23e)\\
f_z(q^2) &= \frac{1}{2}\frac{q^2\left[(r^2 + a^2)\cos\theta - 2ra\right]}{(r^2 + a^2 - 2ra\cos\theta)^3}. & (6.23f)
\end{aligned}$$

The expressions in lines (6.23d) and (6.23f), involving $Q^2$ and $q^2$, are “self-terms”, whereas the expression in (6.23e) involving Qq is the “interaction term”. The reader should complete the steps leading from (6.23b) to these equations. We shall soon show that the self-terms will vanish upon integration, leaving the integrated stress force entirely a function of Qq.

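The “reduced” force transmitted across the surface Σ, and hence acting on the charge Q, is the surface integral of $f_z$. Let us denote this integral as $\mathcal{F}$. An area element on Σ is $da = r^2\sin\theta\,d\theta\,d\phi$. Therefore,

$$\begin{aligned}
\mathcal{F} &= \iint f_z\,r^2\sin\theta\,d\theta\,d\phi\\
&= 2\pi r^2\int_0^\pi f_z\sin\theta\,d\theta & (6.24a)\\
&= 2\pi r^2\left[\mathcal{I}(Q^2) + \mathcal{I}(Qq) + \mathcal{I}(q^2)\right], & (6.24b)\\
\text{where}\quad \mathcal{I}(Q^2) &= \int_0^\pi f_z(Q^2)\sin\theta\,d\theta = 0, & (6.24c)\\
\mathcal{I}(Qq) &= \int_0^\pi f_z(Qq)\sin\theta\,d\theta = -\frac{2Qq}{a^2r^2}, & (6.24d)\\
\mathcal{I}(q^2) &= \int_0^\pi f_z(q^2)\sin\theta\,d\theta = 0. & (6.24e)\\
\text{Hence,}\quad \mathcal{F} &= -\frac{4\pi Qq}{a^2}. & (6.24f)
\end{aligned}$$

The integral given in (6.24c) is easy to evaluate. The other integrals have been worked out in Sec. B.1. They can be worked out more easily using Maxima with a computer.

To get the true force we go back to (6.19), multiply $\mathcal{F}$ with the factor $\frac{1}{16\pi^2\varepsilon_0}$, and get the force $\mathbf{F}_Q$ acting on the charge Q:

$$\mathbf{F}_Q = \frac{1}{16\pi^2\varepsilon_0}\,\mathcal{F}\,\mathbf{e}_z = -\frac{1}{4\pi\varepsilon_0}\frac{Qq}{a^2}\,\mathbf{e}_z. \qquad (6.25)$$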


This force is the familiar Coulomb force on the charge Q located at the 
origin, exerted on it by another charge q located at a distance a on the 
positive Z-axis. It is repulsive, i.e. towards the negative Z-axis, if Qq is 
positive, and attractive, i.e. towards the positive Z-axis, if Qq is negative. 
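The surface integration of this example can be reproduced with a few lines of code instead of Maxima. The following is a minimal sketch under our own assumptions: it works with the reduced field (that is, with the factor $1/4\pi\varepsilon_0$ removed), builds the reduced stress vector directly from Cartesian components in the plane φ = 0 rather than from the closed forms (6.23d)-(6.23f), and uses a plain midpoint rule whose step count is an arbitrary choice. The function names are hypothetical.

```python
import math

def reduced_fz(theta, Q, q, a, r):
    """z-component of the reduced stress vector T_r = E (E.e_r) - (1/2) E^2 e_r
    at polar angle theta on the sphere of radius r centred on Q (plane phi = 0).
    The charge q sits on the Z-axis at z = a."""
    er = (math.sin(theta), 0.0, math.cos(theta))     # outward normal e_r
    P = tuple(r * c for c in er)                     # point on the sphere
    n = (P[0], P[1], P[2] - a)                       # vector from q to P
    n3 = math.sqrt(sum(c * c for c in n)) ** 3
    E = tuple(Q * P[i] / r ** 3 + q * n[i] / n3 for i in range(3))
    E_dot_er = sum(E[i] * er[i] for i in range(3))
    E2 = sum(c * c for c in E)
    return E[2] * E_dot_er - 0.5 * E2 * er[2]

def reduced_force(Q, q, a, r, steps=20000):
    """Midpoint-rule estimate of 2*pi*r^2 * integral_0^pi f_z sin(theta) d(theta)."""
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        theta = (k + 0.5) * h
        total += reduced_fz(theta, Q, q, a, r) * math.sin(theta)
    return 2.0 * math.pi * r * r * total * h
```

With the values of the numerical example, `reduced_force(2.0, -1.0, 3.0, 1.0)` can be compared with $-4\pi Qq/a^2$ from Eq. (6.24f). Repeating the call with another radius r < a is a useful check that the result does not depend on the choice of the boundary surface.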


6.3. Maxwell’s Stress Tensor for the Magnetostatic Field 


This section is the magnetostatic analogue of the electrostatic stress tensor 
presented in Sec. 6.2.3. The steps are parallel, so that we shall avoid 
detailed explanation. 


6.3.1. Volume force density in terms of the field 


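We shall construct Maxwell’s stress tensor for the magnetostatic field, and represent it by the symbol $T^{(m)}$. The volume force density in a magnetic field is $\mathbf{f}^{(m)} = \mathbf{J}\times\mathbf{B}$. Therefore, we need to construct the tensor $T^{(m)}$ under the specification

$$\nabla\cdot T^{(m)} = \mathbf{f}^{(m)} = \mathbf{J}\times\mathbf{B}. \qquad (6.26)$$

This is now an easy task, thanks to the identity (6.7) we had established in Sec. 6.2. We set $\mathbf{B}$ for $\mathbf{A}$ in that equation, and use the field equations $\nabla\cdot\mathbf{B} = 0$; $\nabla\times\mathbf{B} = \mu_0\mathbf{J}$, leading to:

$$\begin{aligned}
\mathbf{f}^{(m)} &= \mathbf{J}\times\mathbf{B} = \nabla\cdot T^{(m)}, & (a)\\
\text{where}\quad T^{(m)} &= \frac{1}{\mu_0}\left[\mathbf{B}\mathbf{B} - \tfrac{1}{2}B^2\,\mathbf{1}\right]. & (b)
\end{aligned} \qquad (6.27)$$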


Note the similarity between the stress tensor $T^{(m)}$ written above and the stress tensor $T^{(e)}$ written in Eq. (6.10). The latter converts into the former if we replace E with B and $\varepsilon_0$ with $\frac{1}{\mu_0}$. In the same way the matrix form given in Eq. (6.11) converts to the matrix form of $T^{(m)}$. Consequently, the stress vector changes from normal outward, to tangential, to normal inward, as the angle between the plane and the direction of the B field changes from 90° to 45° to 0°, as shown in (6.15) and illustrated in Fig. 6.3, and the “Conclusion” written on p. 175 carries over to the case of a magnetic field without any change. Each point in the conclusion is well illustrated in Fig. 6.5 (see next section) if the reader compares the direction of the field vector B in Fig. 6.5(d) with the direction of the stress vector $\mathbf{T}_r$ in Fig. 6.5(e).


Fig. 6.5. Magnetic stress vector on a spherical surface. 


6.3.2. Example 3: Force transmitted between two magnetic 
dipoles across a spherical boundary 


The smallest denomination of the source of a magnetic field is a magnetic dipole, consisting of a tiny current loop. We shall therefore think of the force between two magnetic dipoles. We have placed these dipoles along the Z-axis, and oriented them in the positive direction of this axis. Figure 6.5(a) shows the geometry of this configuration. The dipoles are shown by tiny spherical blobs with an arrow pointing in the direction of the dipole vector. As in the electrostatic example, we shall illustrate Maxwell’s stress tensor $T^{(m)}$ by finding the stress vector on the surface of an imaginary sphere Σ of radius r surrounding the point magnetic dipole M, which is placed at a distance a from the other point magnetic dipole m such that r < a, and then integrate this stress vector over the spherical surface to obtain the force $\mathbf{F}_M$ on M exerted by m.

We shall first obtain the B field at any arbitrary point P(r, θ, φ) located on the spherical surface Σ, at the displacement vector $\mathbf{n}$ from A and $\mathbf{r}$ from O. In order to avoid repeated appearance of the constant $\frac{\mu_0}{4\pi}$, we shall set $\mathbf{B} = \frac{\mu_0}{4\pi}\boldsymbol{\mathcal{B}}$, and use Eq. (6.16).

Let BD (r,6,¢), and B™(r,8,ġ) be the fields? produced by the dipoles M 
and m respectively, at any coordinate point (r, 6, $). Adding them we get 
the total field g(r, 0, $): 


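$$\tilde{\mathbf{B}}(r,\theta,\phi) = \tilde{\mathbf{B}}^{(M)}(r,\theta,\phi) + \tilde{\mathbf{B}}^{(m)}(r,\theta,\phi), \qquad (6.28a)$$

$$\tilde{\mathbf{B}}^{(M)}(r,\theta,\phi) = \frac{3(\mathbf{M}\cdot\mathbf{r})\mathbf{r} - \mathbf{M}r^{2}}{r^{5}} = \tilde{B}_r^{(M)}\mathbf{e}_r + \tilde{B}_\theta^{(M)}\mathbf{e}_\theta, \qquad (6.28b)$$

$$\text{where}\quad \tilde{B}_r^{(M)} = \frac{2M\cos\theta}{r^{3}}, \qquad \tilde{B}_\theta^{(M)} = \frac{M\sin\theta}{r^{3}}; \qquad (6.28c)$$

$$\tilde{\mathbf{B}}^{(m)}(r,\theta,\phi) = \frac{3(\mathbf{m}\cdot\mathbf{n})\mathbf{n} - \mathbf{m}n^{2}}{n^{5}} = \tilde{B}_r^{(m)}\mathbf{e}_r + \tilde{B}_\theta^{(m)}\mathbf{e}_\theta, \qquad (6.28d)$$

$$\text{where}\quad \tilde{B}_r^{(m)} = \frac{m\left[2(r^{2}+a^{2})\cos\theta - (3+\cos^{2}\theta)\,ar\right]}{n^{5}}, \qquad (6.28e)$$

$$\tilde{B}_\theta^{(m)} = \frac{m\,(r^{2} - 2a^{2} + ar\cos\theta)\sin\theta}{n^{5}}. \qquad (6.28f)$$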


For future convenience, we write 


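$$\tilde{\mathbf{B}} = \tilde{B}_r\mathbf{e}_r + \tilde{B}_\theta\mathbf{e}_\theta, \quad\text{where}\quad \tilde{B}_r = \frac{\alpha M}{r^{3}} + \frac{\beta m}{n^{5}}, \qquad \tilde{B}_\theta = \frac{\gamma M}{r^{3}} + \frac{\delta m}{n^{5}}. \qquad (6.29)$$

From (6.28):

$$\alpha = 2\cos\theta, \qquad \beta = 2(r^{2}+a^{2})\cos\theta - (3+\cos^{2}\theta)\,ar,$$

$$\gamma = \sin\theta, \qquad \delta = (r^{2} - 2a^{2} + ar\cos\theta)\sin\theta.$$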


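From Eq. (6.27), the stress tensor is

$$\overleftrightarrow{T}^{(m)} = \frac{1}{\mu_0}\left(\mathbf{B}\mathbf{B} - \tfrac{1}{2}B^{2}\,\overleftrightarrow{1}\right) = \frac{\mu_0}{16\pi^{2}}\left(\tilde{\mathbf{B}}\tilde{\mathbf{B}} - \tfrac{1}{2}\tilde{B}^{2}\,\overleftrightarrow{1}\right) = \frac{\mu_0}{16\pi^{2}}\,\overleftrightarrow{\tilde{T}}^{(m)}, \qquad (6.30a)$$

$$\text{where}\quad \overleftrightarrow{\tilde{T}}^{(m)} = \tilde{\mathbf{B}}\tilde{\mathbf{B}} - \tfrac{1}{2}\tilde{B}^{2}\,\overleftrightarrow{1}, \qquad (6.30b)$$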


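which we may refer to as the “reduced stress tensor”. The non-zero components of this tensor needed by us are

$$\tilde{T}_{rr} = \tilde{B}_r^{2} - \tfrac{1}{2}\tilde{B}^{2} = \tfrac{1}{2}\left(\tilde{B}_r^{2} - \tilde{B}_\theta^{2}\right); \qquad \tilde{T}_{r\theta} = \tilde{T}_{\theta r} = \tilde{B}_r\tilde{B}_\theta. \qquad (6.31)$$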

In order to illustrate the above equations, and to see what the magnetic field vector $\tilde{\mathbf{B}}$ and the Maxwell stress vector $\tilde{\mathbf{T}}_s$ look like on the surface of the imaginary sphere surrounding the dipole M, we shall work out a numerical example, setting M = 2, m = 1, a = 3, r = 1 in Eqs. (6.29) and (6.31). For this purpose, we have prepared Table 3.2, after evaluating the corresponding quantities in columns 1-9 using Maxima. The angles $\phi_B$, $\phi_T$ appearing in this table have been explained in Fig. 6.5(c). See also Eq. (6.21).
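The entries of such a table can be re-evaluated with a few lines of code. The following is a minimal Python sketch (the book itself used Maxima), assuming that the dipole m sits at z = +a, so that the distance n = AP follows from the law of cosines, $n^2 = r^2 + a^2 - 2ar\cos\theta$. The function names `reduced_field` and `reduced_stress` are illustrative and do not come from the text.

```python
import math

def reduced_field(theta, M=2.0, m=1.0, a=3.0, r=1.0):
    """Return (B_r, B_theta) of the reduced field, Eq. (6.29)."""
    c, s = math.cos(theta), math.sin(theta)
    # Assumption: m is located at z = +a, so n = |AP| by the law of cosines.
    n = math.sqrt(r * r + a * a - 2.0 * a * r * c)
    alpha = 2.0 * c
    beta = 2.0 * (r * r + a * a) * c - (3.0 + c * c) * a * r
    gamma = s
    delta = (r * r - 2.0 * a * a + a * r * c) * s
    B_r = alpha * M / r**3 + beta * m / n**5
    B_t = gamma * M / r**3 + delta * m / n**5
    return B_r, B_t

def reduced_stress(theta, M=2.0, m=1.0, a=3.0, r=1.0):
    """Return (T_rr, T_theta_r) of the reduced stress tensor, Eq. (6.31)."""
    B_r, B_t = reduced_field(theta, M, m, a, r)
    return 0.5 * (B_r**2 - B_t**2), B_r * B_t

if __name__ == "__main__":
    for deg in range(0, 181, 30):
        th = math.radians(deg)
        B_r, B_t = reduced_field(th)
        T_rr, T_tr = reduced_stress(th)
        print(f"{deg:4d}  {B_r:7.2f} {B_t:7.2f}  {T_rr:7.2f} {T_tr:7.2f}")
```

At θ = 0 the sketch gives $\tilde{B}_r = 4.25$ and $\tilde{T}_{rr} = 9.03$; at intermediate angles the second decimal place depends on rounding.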

We have plotted $\tilde{T}_{rr}$, $\tilde{T}_{\theta r}$ as functions of the polar angle θ in Fig. 6.5(b), using Maxima, and have drawn the vectors $\tilde{\mathbf{B}}$ and $\tilde{\mathbf{T}}_s$ on the sphere $\Sigma$ in Figs. 6.5(d) and 6.5(e) (using two different scales for the two sets of vectors).


All this tedious work will have been fruitful if we can show that the surface force density, when integrated over the entire surface $\Sigma$, yields the same force between the two dipoles that we can calculate using the standard formulas of magnetostatics. Let us then first apply the “standard formula”.


Table 3.2. B and T_s vectors on the surface of the sphere.

   1       2       3       4       5        6       7       8       9
   θ      B_r     B_θ      B      φ_B     T_rr    T_θr      T      φ_T
   0°     4.25    0       4.25     0°     9.03    0        9.03     0°
  30°     3.58    0.86    3.69    13.5°   6.05    3.09     6.80    26.9°
  60°     2.00    1.64    2.59    39.3°   0.66    3.28     3.34    78.5°
  90°    −0.03    1.96    1.96   −89.4°  −1.91   −0.06     1.91     1.79°
 120°    −2.03    1.71    2.66   −40.1°   0.60   −3.48     3.53   −76.8°
 150°    −3.5     0.99    3.63   −16.0°   5.62   −3.47     6.60   −31.5°
 180°    −4.03    0       4.03     0°     8.13    0        8.13     0°


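The force $\mathbf{F}_m$ on m is given by the formula $\mathbf{F} = (\mathbf{m}\cdot\nabla)\mathbf{B}$, in which B is the field created by M. The m vector is in the Z-direction. Therefore, $\mathbf{m}\cdot\nabla = m\,\partial/\partial z$, which means that we can treat the (x, y) coordinates as constant and equal to zero. Therefore,

$$\mathbf{F}_m = m\left.\frac{\partial\mathbf{B}}{\partial z}\right|_{x=y=0,\;z=a},$$

$$\text{where}\quad \mathbf{B}(0,0,z) = \frac{\mu_0 M}{4\pi}\left[\frac{3z^{2} - z^{2}}{z^{5}}\right]\mathbf{e}_z, \qquad \left.\frac{\partial\mathbf{B}}{\partial z}\right|_{x=y=0,\;z=a} = -\frac{3\mu_0 M}{2\pi}\,\frac{1}{a^{4}}\,\mathbf{e}_z. \qquad (6.32)$$

$$\text{Hence,}\quad \mathbf{F}_m = -\frac{3\mu_0\, mM}{2\pi a^{4}}\,\mathbf{e}_z.$$

By Newton’s third law of motion,

$$\mathbf{F}_M = -\mathbf{F}_m = \frac{3\mu_0\, Mm}{2\pi a^{4}}\,\mathbf{e}_z. \qquad (6.33)$$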


Now we shall calculate the same force using the stress tensor. The surface force density is the same as the stress vector on this surface. We shall work with the “reduced” surface force density, the same as $\tilde{\mathbf{T}}_s$.

The force of attraction between the dipoles will be along the line OA joining them, which lies on the Z-axis. Therefore, we need the Z component of the surface force density, $f_z$:

$$f_z = \mathbf{e}_z\cdot\tilde{\mathbf{T}}_s = (\cos\theta\,\mathbf{e}_r - \sin\theta\,\mathbf{e}_\theta)\cdot(\mathbf{e}_r\tilde{T}_{rr} + \mathbf{e}_\theta\tilde{T}_{\theta r}) = \cos\theta\,\tilde{T}_{rr} - \sin\theta\,\tilde{T}_{\theta r}. \qquad (6.34)$$


We shall break up this force density into three components: (1) $f_z(M^2)$ representing the self-term for M, (2) $f_z(Mm)$ representing the interaction term between M and m, (3) $f_z(m^2)$ representing the self-term for m. From Eqs. (6.29), (6.31) and (6.34):

$$f_z(M^{2}) = \left[(\alpha^{2} - \gamma^{2})\cos\theta - 2\alpha\gamma\sin\theta\right]\frac{M^{2}}{2r^{6}}, \qquad (6.35a)$$

$$f_z(Mm) = \left[(\alpha\beta - \gamma\delta)\cos\theta - (\alpha\delta + \beta\gamma)\sin\theta\right]\frac{Mm}{r^{3}n^{5}}, \qquad (6.35b)$$

$$f_z(m^{2}) = \left[(\beta^{2} - \delta^{2})\cos\theta - 2\beta\delta\sin\theta\right]\frac{m^{2}}{2n^{10}}. \qquad (6.35c)$$


The “reduced” force $\tilde{F}$ transmitted across the surface $\Sigma$, and hence acting on the dipole M, is the surface integral of $f_z$, which is the sum of the integrals of $f_z(M^2)$, $f_z(Mm)$, and $f_z(m^2)$. Each integral is difficult to evaluate, because α, β, γ, δ are complicated functions of r, a, θ. We have evaluated these integrals using Maxima. See Appendix B. The result is as follows:

$$\tilde{F} = \iint f_z\, r^{2}\sin\theta\, d\theta\, d\phi = 2\pi r^{2}\int_0^{\pi} f_z\sin\theta\, d\theta = 2\pi r^{2}\left[I(M^{2}) + I(Mm) + I(m^{2})\right],$$

$$\text{where}\quad I(M^{2}) = \int_0^{\pi} f_z(M^{2})\sin\theta\, d\theta = 0,$$

$$I(Mm) = \int_0^{\pi} f_z(Mm)\sin\theta\, d\theta = \frac{12Mm}{a^{4}r^{2}}, \qquad (6.36)$$

$$I(m^{2}) = \int_0^{\pi} f_z(m^{2})\sin\theta\, d\theta = 0.$$

$$\text{Hence,}\quad \tilde{F} = \frac{24\pi Mm}{a^{4}}.$$


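Because of the relation (6.30) the true force $\mathbf{F}_M$ acting on the dipole M is $\frac{\mu_0}{16\pi^{2}}$ times the force $\tilde{F}$. Hence

$$\mathbf{F}_M = \frac{3\mu_0\, Mm}{2\pi a^{4}}\,\mathbf{e}_z. \qquad (6.37)$$
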
We have thus verified that the stress tensor has given us the same force that 
we obtained in Eq. (6.33) using standard formulas of magnetostatics. 
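The surface integral in Eq. (6.36) can also be checked numerically. The sketch below uses Python with a simple midpoint rule in place of the symbolic Maxima evaluation of Appendix B. It assumes the reduced field of Eq. (6.29) with m located at z = +a, and the helper names (`f_z`, `reduced_force`, `expected_reduced_force`) are illustrative only.

```python
import math

def f_z(theta, M, m, a, r):
    """z-component of the reduced stress vector on the sphere, Eq. (6.34)."""
    c, s = math.cos(theta), math.sin(theta)
    n = math.sqrt(r * r + a * a - 2.0 * a * r * c)   # assumes m at z = +a
    B_r = 2.0 * c * M / r**3 \
        + (2.0 * (r * r + a * a) * c - (3.0 + c * c) * a * r) * m / n**5
    B_t = s * M / r**3 + (r * r - 2.0 * a * a + a * r * c) * s * m / n**5
    T_rr = 0.5 * (B_r**2 - B_t**2)
    T_tr = B_r * B_t
    return c * T_rr - s * T_tr

def reduced_force(M, m, a, r, steps=20000):
    """Midpoint-rule estimate of 2*pi*r^2 * integral of f_z*sin(theta) over [0, pi]."""
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        theta = (k + 0.5) * h
        total += f_z(theta, M, m, a, r) * math.sin(theta)
    return 2.0 * math.pi * r * r * total * h

def expected_reduced_force(M, m, a):
    """Closed-form result quoted below Eq. (6.36)."""
    return 24.0 * math.pi * M * m / a**4

if __name__ == "__main__":
    print(reduced_force(2.0, 1.0, 3.0, 1.0), expected_reduced_force(2.0, 1.0, 3.0))
```

The numerical value should not depend on the radius r of the imaginary sphere as long as r < a, which is a useful consistency check on the self-terms.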

We have worked out three examples to bring out the meaning of Maxwell’s stress tensor for electric and magnetic fields. The reader may wonder why we should take such a tortuous road to get answers that can be easily obtained using simpler formulas of electrostatics and magnetostatics. Isn’t it like demolishing a mud wall with a cannon?

Every cannon needs a mud wall to establish its trustworthiness before deployment in a true situation. Maxwell’s stress tensor is destined to play a bigger role, in constructing the conservation equation for field momentum, and later, under the watchful eye of Special Relativity, in building up the covariant expression for conservation of energy and momentum. The three examples we have worked out were intended to be an intellectual exercise to instill confidence in the mathematical expressions of $\overleftrightarrow{T}^{(e)}$ and $\overleftrightarrow{T}^{(m)}$ before crowning them for their majestic role.

Our next example is not a mud wall. It shows how Maxwell’s stress 
tensor can solve a difficult problem directly. 


6.4. Example 4: The Force Between Two Hemispheres of a 
Charged Sphere 


Consider a uniformly charged sphere of radius R, carrying a total charge Q. What is the (repulsive) force that the lower hemisphere exerts on the upper hemisphere?

Finding the force by a naive application of Coulomb’s law can be 
difficult. 

The solution of this problem can be found in Griffiths.$^{c}$ However, Griffiths employs the Cartesian coordinate system. We shall use the spherical coordinate system to obtain the result compactly. We have illustrated the geometry in Fig. 6.6.

We divide the boundary surface into two parts: (1) the upper surface $S_{\text{top}}$, on which the normal vector is $\mathbf{e}_r$, (2) the lower surface $S_{\text{bottom}}$, on which the normal vector is $-\mathbf{e}_z = \mathbf{e}_\theta$. The E-field is radial on both. We shall find the stress vectors $\tilde{\mathbf{T}}^{(t)}$, $\tilde{\mathbf{T}}^{(b)}$, and their normal components $\tilde{T}^{(t)}_r$, $\tilde{T}^{(b)}_\theta$ on both surfaces, and by integrating, shall get the answer. Let us first get the stress tensors on the top and the bottom surfaces. The (reduced) electric fields are as follows:


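$$\tilde{\mathbf{E}}^{(t)} = \frac{Q}{R^{2}}\,\mathbf{e}_r; \qquad \tilde{\mathbf{E}}^{(b)} = \frac{Qr}{R^{3}}\,\mathbf{e}_r. \qquad (6.38)$$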


Fig. 6.6. Charged hemisphere. 


Using Eq. (6.19), the (reduced) stress tensor takes the following forms: 


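$$\overleftrightarrow{\tilde{T}}^{(t)} = \frac{Q^{2}}{2R^{4}}\begin{pmatrix} 1 & 0 & 0\\ 0 & -1 & 0\\ 0 & 0 & -1 \end{pmatrix}; \qquad \overleftrightarrow{\tilde{T}}^{(b)} = \frac{Q^{2}r^{2}}{2R^{6}}\begin{pmatrix} 1 & 0 & 0\\ 0 & -1 & 0\\ 0 & 0 & -1 \end{pmatrix}.$$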
Note that the normal to the top surface is $\mathbf{n} = \mathbf{e}_r$, and the normal to the bottom surface is $\mathbf{n} = -\mathbf{e}_z = \mathbf{e}_\theta$. Since the net force is in the z-direction, we shall consider only the (z, r)-component of the stress vector on the top surface, and the (z, θ)-component on the bottom surface.
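$$\tilde{T}^{(b)}_{\theta} = \mathbf{e}_z\cdot\overleftrightarrow{\tilde{T}}^{(b)}\cdot\mathbf{e}_\theta = -\mathbf{e}_\theta\cdot\overleftrightarrow{\tilde{T}}^{(b)}\cdot\mathbf{e}_\theta = -\tilde{T}^{(b)}_{\theta\theta} = \frac{Q^{2}r^{2}}{2R^{6}}.$$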


It is seen from the last equation that the stress vector is pointing into the volume above the surface, though the field E is parallel to the surface. This may appear strange at first sight, but conforms to Eq. (6.15), and the conclusions following it.

Integrating the stress vectors given in Eq. (6.40) over the respective 
surfaces, we get the force on the upper hemisphere. 
Adding the above two forces and multiplying by $\frac{1}{16\pi^{2}\epsilon_0}$ (see Eq. (6.19)), we get the total force:
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(6.42) 
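The two surface integrals of this example are simple enough to check numerically. The following Python sketch makes several assumptions that should be read as such: the sphere is uniformly charged throughout its volume, so the reduced field is $Q/R^2$ on the top surface and $Qr/R^3$ on the equatorial disc; the reduced stress tensor is $\tilde{\mathbf{E}}\tilde{\mathbf{E}} - \tfrac12\tilde{E}^2\overleftrightarrow{1}$; and the conversion factor to the true force is $1/(16\pi^2\epsilon_0)$. The function names are illustrative.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m

def reduced_forces(Q, R, steps=20000):
    """Midpoint-rule estimates of the reduced z-forces through the top
    hemispherical surface and the bottom (equatorial) disc."""
    # Top surface: z-component of stress vector = (Q^2 / 2R^4) cos(theta),
    # area element = 2*pi*R^2 sin(theta) d(theta), theta in [0, pi/2].
    h = (math.pi / 2.0) / steps
    top = 0.0
    for k in range(steps):
        th = (k + 0.5) * h
        top += (Q * Q / (2.0 * R**4)) * math.cos(th) * R * R * math.sin(th)
    top *= 2.0 * math.pi * h
    # Bottom disc: z-component of stress vector = Q^2 r^2 / (2R^6),
    # area element = 2*pi*r dr, r in [0, R].
    hr = R / steps
    bottom = 0.0
    for k in range(steps):
        rr = (k + 0.5) * hr
        bottom += (Q * Q * rr * rr / (2.0 * R**6)) * 2.0 * math.pi * rr
    bottom *= hr
    return top, bottom

def true_force(Q, R):
    """Total force on the upper hemisphere in newtons (SI units)."""
    top, bottom = reduced_forces(Q, R)
    return (top + bottom) / (16.0 * math.pi**2 * EPS0)

if __name__ == "__main__":
    print(true_force(1e-6, 0.1))
```

Under these assumptions the two contributions come out as $\pi Q^2/2R^2$ and $\pi Q^2/4R^2$, and their sum reproduces the well-known result $3Q^2/(64\pi\epsilon_0R^2)$.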


6.5. Maxwell’s Stress Tensor for the Electromagnetic Field and 
Momentum Conservation 


We had introduced Maxwell’s stress tensor for static electric and static magnetic fields, with suitable applications, in Secs. 6.2 and 6.3. These applications demonstrated that the force acting on static distributions of electric charges and currents lying within a bounded volume V is equal to the stress vector integrated over the surface S bounding this volume. The attribute “static” implied that the objects considered in our discussion, e.g. isolated charges and isolated current carrying loops, were fixed with a kind of “glue” making them immobile in spite of the electric and magnetic forces acting on them. We shall now remove that glue and see what role can now be played by the same stress tensors.

At this point, we shall make a subtle distinction between force and stress. Force acts on material objects, which may be discrete charged particles or a localized continuous material medium, e.g. a plasma. The stress considered here acts on the field, which is a kind of ethereal medium, as conceived by Maxwell and his contemporary physicists. In the absence of any glue holding them, the charges (e.g. electrons, nuclei) and currents (e.g. current loops) will be free to move and gain momentum. However, the momentum need not be confined to material objects. It can be shared by the field as well. Therefore, we shall make the following conjecture.


Conjecture 6.1. There exists a Maxwell’s stress tensor $\overleftrightarrow{T}^{(em)}$ for the electromagnetic field, and it is given as
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such that 
where g and P are, respectively, the field momentum density and the material momentum density, the latter being governed by the Newton-Minkowski-Lorentz force equation

$$\frac{\partial\mathbf{P}}{\partial t} = \rho\mathbf{E} + \mathbf{J}\times\mathbf{B}. \qquad (6.45)$$
Moral: It is seen from (6.44) that the force transmitted by the Maxwell stress tensor $\overleftrightarrow{T}^{(em)}$ across a closed surface S contributes to the total momentum inside the volume V (bounded by the same surface), and has two parts, namely, the mechanical part and the field part.

In Appendix A.3, we have shown all the 3 × 3 components of $\overleftrightarrow{T}^{(em)}$ explicitly.

The right-hand side of Eq. (6.44) gives the stress transmitted across the boundary S. The right-hand side of Eq. (6.45) gives the density of Lorentz force acting on all charged matter lying within the volume V. We shall convert the surface integral on the right-hand side of (6.44) into a volume integral, using Gauss’s theorem (see Sec. 5.2.4) so that each term in this equation is a volume integral, and then remove the integral sign, reducing the same equation to an equality among three density functions:

$$\frac{\partial}{\partial t}(\mathbf{g} + \mathbf{P}) = \nabla\cdot\overleftrightarrow{T}^{(em)}, \qquad (6.46a)$$

$$\text{or}\quad \frac{\partial\mathbf{g}}{\partial t} + \rho\mathbf{E} + \mathbf{J}\times\mathbf{B} = \nabla\cdot\overleftrightarrow{T}^{(em)}. \qquad (6.46b)$$


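We shall now show that the above conjecture is right, that starting from Maxwell’s equations we are able to find an expression for the field momentum density such that the momentum conservation of matter and field together falls into the scheme suggested in Eq. (6.46). Our task is made simple by the identity (6.7) we had established in Sec. 6.2. We shall do the work in two stages: (1) set E for A in (6.7), and use Maxwell’s equations $\nabla\cdot\mathbf{E} = \rho/\epsilon_0$; $\nabla\times\mathbf{E} = -\partial\mathbf{B}/\partial t$, (2) set B for A and use Maxwell’s equations $\nabla\cdot\mathbf{B} = 0$; $\nabla\times\mathbf{B} = \mu_0\left(\mathbf{J} + \epsilon_0\,\partial\mathbf{E}/\partial t\right)$. Hence,

$$\nabla\cdot\overleftrightarrow{T}^{(e)} = \nabla\cdot\epsilon_0\left[\mathbf{E}\mathbf{E} - \tfrac{1}{2}E^{2}\,\overleftrightarrow{1}\right] = \epsilon_0\left[(\nabla\cdot\mathbf{E})\mathbf{E} - \mathbf{E}\times(\nabla\times\mathbf{E})\right]$$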
Equation (6.47c) is obtained by adding Eqs. (6.47a) and (6.47b), and using the definition of $\overleftrightarrow{T}^{(em)}$ as given in (6.43). It confirms the validity of our conjecture and identifies the field momentum density as

$$\mathbf{g} = \epsilon_0\,\mathbf{E}\times\mathbf{B}. \qquad (6.48)$$


We would like to recast Eq. (6.46a) into the general format of the conservation equation

$$\frac{\partial}{\partial t}(\text{volume density}) + \nabla\cdot(\text{flux density}) = 0. \qquad (6.49)$$
In this case, the momentum flux density $\overleftrightarrow{\Phi}^{(em)}$ is to be identified as

$$\overleftrightarrow{\Phi}^{(em)} = -\overleftrightarrow{T}^{(em)}. \qquad (6.50)$$

Equation (6.46a) now reads like a true momentum conservation equation:

$$\frac{\partial}{\partial t}(\mathbf{g} + \mathbf{P}) + \nabla\cdot\overleftrightarrow{\Phi}^{(em)} = 0. \qquad (6.51)$$
It may be easier to comprehend the meaning of the above conservation equation by writing its three Cartesian components. For example, the x-component of the above equation will be

$$\frac{\partial g_x}{\partial t} + \frac{\partial P_x}{\partial t} + \nabla\cdot\boldsymbol{\Phi}_x = 0, \qquad (6.52a)$$

$$\text{where}\quad \boldsymbol{\Phi}_x = \overleftrightarrow{\Phi}^{(em)}\cdot\mathbf{e}_x = -\overleftrightarrow{T}^{(em)}\cdot\mathbf{e}_x. \qquad (6.52b)$$
The first two terms in Eq. (6.52a) give the rate of increase of the x-component of total momentum (consisting of field momentum and material momentum) per unit volume, and the third term gives the rate of outflux of the x-component of the field momentum per unit volume. Conservation of momentum implies that the sum of the two must be zero.

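Before leaving this topic let us recall the expressions for the field energy density w and the field energy flux density S (i.e. Poynting’s vector, Sec. 12.3):

$$w = \frac{\epsilon_0}{2}\left[E^{2} + c^{2}B^{2}\right] \qquad \text{Field Energy Density}, \qquad (6.53a)$$

$$\mathbf{S} = \epsilon_0 c\left[\mathbf{E}\times c\mathbf{B}\right] \qquad \text{Field Energy Flux Density}. \qquad (6.53b)$$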


It is immediately noticed that

$$\mathbf{S} = c^{2}\mathbf{g}. \qquad (6.54)$$


When the electromagnetic field is a radiation field, $E = cB$ and $\mathbf{E}\times c\mathbf{B} = E^{2}\mathbf{n}$, where n is the direction of Poynting’s vector, giving the direction of the flow of radiation energy. For such radiation fields,

$$w = \epsilon_0E^{2}, \qquad \mathbf{S} = cw\,\mathbf{n}, \qquad \mathbf{g} = \frac{w}{c}\,\mathbf{n}, \qquad w = cg. \qquad (6.55)$$

The last equality is a reminder of the relation E = cp between the energy E and the momentum p of a photon.

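We are still not too clear about the true meaning of the momentum flux density $\overleftrightarrow{\Phi}$. To get familiar with it let us consider a plane electromagnetic wave propagating in the x-direction, polarized in the y-direction. For such a field $\mathbf{E} = E\mathbf{e}_y$, $c\mathbf{B} = E\mathbf{e}_z$. It is a simple exercise to evaluate $\overleftrightarrow{\Phi}$ by setting $E_x = 0$, $E_y = E$, $E_z = 0$; $cB_x = 0$, $cB_y = 0$, $cB_z = E$ in the expression for $\boldsymbol{\Phi}_x$ in Eq. (6.52c) and similar expressions for $\boldsymbol{\Phi}_y$, $\boldsymbol{\Phi}_z$ and obtain

$$\overleftrightarrow{\Phi} = \boldsymbol{\Phi}_x\mathbf{e}_x + \boldsymbol{\Phi}_y\mathbf{e}_y + \boldsymbol{\Phi}_z\mathbf{e}_z = (\epsilon_0E^{2}\,\mathbf{e}_x)\,\mathbf{e}_x = c\,\mathbf{g}\,\mathbf{e}_x = \mathbf{g}\,\mathbf{c}. \qquad (6.56)$$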


Here $\mathbf{c} = c\,\mathbf{e}_x$ represents the “velocity” of light, being the speed c multiplied by a unit vector in the direction of propagation. If we now consider a plane perpendicular to the X-axis, so that $\mathbf{n} = \mathbf{e}_x$, then the outflux of field momentum per unit area across the plane will be $\overleftrightarrow{\Phi}\cdot\mathbf{n} = \overleftrightarrow{\Phi}\cdot\mathbf{e}_x = c\,\mathbf{g}$.

Generalization of Eq. (6.56) is obvious. If there is a source of radiation at the origin (say, an antenna, or an accelerating charged particle), then far away from the origin, the momentum flux density tensor $\overleftrightarrow{\Phi}$ has the form

$$\overleftrightarrow{\Phi} = c\,g\,\mathbf{e}_r\mathbf{e}_r = \mathbf{g}\,c\,\mathbf{e}_r = \mathbf{g}\,\mathbf{c}, \qquad (6.57)$$

where $\mathbf{e}_r$ is the unit vector in the radial direction, also identified with the direction of propagation of the electromagnetic wave. The tensor $\overleftrightarrow{\Phi}$ gives the measure of how much momentum is crossing a spherical surface per unit area per unit time. The momentum density is $\mathbf{g} = g\,\mathbf{e}_r$, and it is propagating in the radial direction with velocity $\mathbf{c} = c\,\mathbf{e}_r$.
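The plane-wave result (6.56) can be verified component by component. The Python sketch below assumes the standard form of the electromagnetic stress tensor, $T_{ij} = \epsilon_0\left[E_iE_j + c^2B_iB_j - \tfrac12\delta_{ij}(E^2 + c^2B^2)\right]$, which is taken here as an assumption about Eq. (6.43) since that equation is not reproduced in this section. The function names are illustrative.

```python
EPS0 = 8.8541878128e-12   # vacuum permittivity, F/m
C = 299792458.0           # speed of light, m/s

def stress_tensor(E, cB):
    """3x3 electromagnetic stress tensor (assumed standard form)."""
    e2 = sum(x * x for x in E)
    b2 = sum(x * x for x in cB)
    return [[EPS0 * (E[i] * E[j] + cB[i] * cB[j]
                     - (0.5 * (e2 + b2) if i == j else 0.0))
             for j in range(3)] for i in range(3)]

def momentum_flux(E, cB):
    """Momentum flux density tensor, Phi = -T, as in Eq. (6.50)."""
    return [[-t for t in row] for row in stress_tensor(E, cB)]

def momentum_density(E, cB):
    """Field momentum density g = eps0 E x B = (eps0 / c) E x cB."""
    cross = [E[1] * cB[2] - E[2] * cB[1],
             E[2] * cB[0] - E[0] * cB[2],
             E[0] * cB[1] - E[1] * cB[0]]
    return [EPS0 * x / C for x in cross]

if __name__ == "__main__":
    E0 = 100.0                      # field amplitude in V/m (arbitrary choice)
    E, cB = [0.0, E0, 0.0], [0.0, 0.0, E0]
    print(momentum_flux(E, cB))     # only the xx entry is non-zero
    print(momentum_density(E, cB))  # points along +x
```

For the wave of the text, the only non-zero entry of the flux tensor is the xx component, equal to $\epsilon_0E^2 = cg$.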


$^{a}$ The Cartesian coordinate system is associated with his name.
$^{b}$ See [14, Eq. (3.89)].
$^{c}$ See [14, p. 368].


Part II


Physics in Four Dimensions 


Chapter 7 


Space-Time and Its Inhabitants 


7.1. World Line in Space-Time 


It was noticed from the Lorentz transformation formulas derived in Secs. 3.1 and 3.2 that the space and time coordinates of any event Θ in the frame S′ are linear combinations of the space and time coordinates in S, and vice versa. Therefore, in a certain sense, the borderline between the space coordinates on the one hand and the time coordinate on the other looks blurred in Relativity. We therefore imagine a four-dimensional world of events where the time coordinate takes an equal status along with the three space coordinates. This composite world, integrating time with space, is called Space-Time.*

Space-time needs four axes, namely, the time axis cT and the space axes X, Y, Z, and four coordinates (ct, x, y, z). It will be convenient to label the coordinate axes as $X^\mu$, and write the four coordinates of an event as $(x^\mu)$, with μ = 0 for the time axis/coordinate ct, and μ = 1, 2, 3 for the space axes/coordinates x, y, z, respectively. That is, $(x^0 = ct,\ x^1 = x,\ x^2 = y,\ x^3 = z)$. We shall follow this convention in this book. Also, we shall use Greek indices, e.g. μ, ν, α, β, to mean all the four coordinates, and Roman indices, e.g. i, j, k, to mean only the three space coordinates. For example, we may write $(x^\mu) = (x^0; x^i) = (ct, x, y, z)$.

Note that we are using the coordinate index μ as a superscript, i.e. as a contravariant index. The reader will understand the reason in Sec. 7.8, where we shall make a distinction between a contravariant 4-vector and a covariant 4-vector.




Fig. 7.1. Events A, B, C on space-time diagram: (a) Two space axes X, Y shown; (b) single space 
axis X shown. 


In Fig. 7.1, we have presented a view of space-time, which we shall refer to as a space-time diagram (or ST diagram), and displayed three events along with their coordinates: $A = (x_A^0, x_A^1, x_A^2, x_A^3)$, $B = (x_B^0, x_B^1, x_B^2, x_B^3)$, $C = (x_C^0, x_C^1, x_C^2, x_C^3)$. In Fig. 7.1(a), we have suppressed the Z-axis, so that we can show events on paper, and marked the coordinates on the respective axes. In Fig. 7.1(b), we have made the ST diagram simpler by suppressing both the Y- and Z-axes, exposing only the X-axis and the time axis.

The above mental construct of the four-dimensional world will not 
diminish the role of the familiar world of pure space dimensions X, Y, Z. It 
is in this space that we see objects like satellites, planets, locomotives. We 
shall call this space the physical space. 

In Fig. 7.2(a), we have presented another view of space-time for describing the motion of a particle which is confined to move only on the XY-plane. $\mathcal{C}$ is its physical trajectory, and P is one point on it.

On the other hand, the trajectory of the particle in space-time, shown as Ω, is called the world line of the particle. The event that “the particle has reached P” is represented by the point $\Theta_P$ on the world line Ω, and is called a world point of the particle.

A particle that does not move at all in a given frame of reference S still moves continuously along its world line, as depicted by the straight line Σ, directed upwards, because the time clock is continuously ticking. The world line of a photon (i.e. a light quantum, often represented by the symbol γ) propagating along the $X^1$-axis must make an angle of 45° with the $X^0$-axis, as illustrated by the straight line Γ. In fact, one can construct, at any event $\Theta_P$, a light cone $\Lambda_P$ whose surface will make the angle of 45° with the $X^0$-axis.

Fig. 7.2. World line and light cones.


Figure 7.2(b) presents a better picture of the light cone and its significance. At any event point $\Theta_P$ the light cone carves out a region of space-time as Future, and another as Past, both of them confined within the light cone, the Future occupying the upper part, the Past occupying the lower one.

Standing at the event $\Theta_P$ I am entitled to have information only of events that occurred inside the “Past” segment of the light cone, everything outside remaining beyond my knowledge. In the same way, all future events which will originate from $\Theta_P$, i.e. whose world lines will pass through $\Theta_P$, will lie within the “Future” segment of the light cone. The reason for this conclusion is that all information is received through messengers that move with velocities that can never exceed the speed of light c. The fastest messenger is light, or a radio signal, moving with the speed c.

The world line of a photon, marked Γ, having $\Theta_P$ as a world point, must graze the light cone, passing through two points A and B, lying on the Past segment and on the Future segment, respectively. Similarly, the world line of a material particle, marked Ω, progressing from the past to the future through the event $\Theta_P$, must lie inside the light cone, as illustrated in the figure.


7.2. Hyperbolic World Line of a Particle Moving Under a 
Constant Force 


A relativistic particle moving under a constant force undergoes a constant acceleration a with respect to its instantaneous rest frame, as we had found out on p. 97. If the force is F, and the rest mass of the particle is $m_0$, then $a = F/m_0$.

Consider a constant force F acting on a particle in the X-direction. A good example can be a charged particle placed in a uniform electric field $\mathbf{E} = E_0\mathbf{e}_x$. Let us assume that at t = 0 this particle is instantaneously at rest, and located at the origin O, in a certain frame S. Then the (x, ct) coordinates of this particle are given in this frame, as functions of the proper time τ, as (see Eqs. (4.104) and (4.105))

$$x = \frac{c^{2}}{a}\left[\cosh\frac{a\tau}{c} - 1\right]; \qquad ct = \frac{c^{2}}{a}\left[\sinh\frac{a\tau}{c}\right]. \qquad (7.1)$$
The above parametric equation of the world line transforms into the familiar equation of a hyperbola, involving only the space and time coordinates:

$$(x + \rho)^{2} - (ct)^{2} = \rho^{2}, \quad\text{where}\quad \rho = \frac{c^{2}}{a} = \text{unit length}. \qquad (7.2)$$

We thus get a hyperbolic world line.
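The step from the parametric form (7.1) to the hyperbola (7.2) rests on the identity $\cosh^2 u - \sinh^2 u = 1$. A short numerical check in Python, with illustrative function names and units chosen so that c = 1 by default, may help the reader see this.

```python
import math

def world_line(tau, a, c=1.0):
    """Return (x, ct) on the world line of Eq. (7.1) at proper time tau."""
    rho = c * c / a
    x = rho * (math.cosh(a * tau / c) - 1.0)
    ct = rho * math.sinh(a * tau / c)
    return x, ct

def hyperbola_residual(tau, a, c=1.0):
    """Left side minus right side of Eq. (7.2); should vanish."""
    rho = c * c / a
    x, ct = world_line(tau, a, c)
    return (x + rho) ** 2 - ct ** 2 - rho ** 2

if __name__ == "__main__":
    for tau in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(tau, world_line(tau, a=1.0), hyperbola_residual(tau, a=1.0))
```

Negative values of τ describe the inward leg of the journey and positive values the outward leg; both lie on the same branch of the hyperbola.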

We have used Gnuplot to plot the hyperbolic world line, represented by Eq. (7.2), in Fig. 7.3, in which the $X^0$- and the $X^1$-axes are each graduated in the scale of ρ = 1 unit.

We have highlighted a few important features of the world line in Figs. 7.3(a) and 7.3(b). The physical trajectory of the particle is +∞ → A → O → B → +∞, i.e. a directed straight line merging with the $X^1$-axis, reversing its direction at O. The particle comes from infinity with velocity ≈ $-c\,\mathbf{e}_x$, along the $X^1$-axis but in the negative direction, decelerates due to application of the force in the $+\mathbf{e}_x$-direction, stops momentarily at the origin O, then turns back and returns to infinity with velocity ≈ $+c\,\mathbf{e}_x$ along the same $X^1$-axis, but now in the positive direction. A, O and B are three points on this axis reached by the particle at certain times during the inward and outward journey. The corresponding events are $\Theta_A$, $\Theta_O$ and $\Theta_B$.

In Fig. 7.3(a), we have drawn a single light cone at the event $\Theta_O$, and in Fig. 7.3(b) at each of the three events $\Theta_A$, $\Theta_O$ and $\Theta_B$. Note that as the velocity of the particle approaches c, its world line almost grazes the light cone, but never goes outside it.


[Figure panels (a) and (b): light cones with the regions labelled “Light Cone, Future” and “Light Cone, Past”.]
Fig. 7.3. World line of a charged particle under a uniform electric field. 


We shall top up the above exercise with some realistic numerical estimates. Let us first define a characteristic time $\tau_0$, such that the particle would reach the velocity c in this time, if non-relativistic mechanics had been applicable. That is,

$$a\tau_0 = c, \quad\text{or}\quad \tau_0 = c/a. \qquad (7.3a)$$

$$\text{Hence,}\quad \rho = \frac{c^{2}}{a} = c\tau_0. \qquad (7.3b)$$
We shall consider a charged particle, e.g. an electron, which is accelerated in a 30 m long linear accelerator (e.g. pelletron) to 30 MeV. The electric field through which the particle is accelerated is assumed to be uniform, and equal to $E = 10^{6}$ V/m. The charge and mass of the electron are $e = 1.6\times10^{-19}$ C, $m_0 = 9.11\times10^{-31}$ kg. The acceleration is then

$$a = \frac{eE}{m_0} = \frac{1.6\times10^{-19}\times10^{6}}{9.11\times10^{-31}} = 0.17\times10^{18}\ \text{m/s}^{2},$$

$$\text{so that}\quad \tau_0 = (3\times10^{8})/(0.17\times10^{18}) = 17.6\times10^{-10}\ \text{s}. \qquad (7.4)$$

$$\text{Hence,}\quad \rho = 17.6\times10^{-10}\times3\times10^{8} = 0.528\ \text{m}.$$


We go back to Sec. 4.4, copy formulas (4.58) and (4.59), which give the velocity and displacement of the particle:

$$v = c\beta = \frac{c\,(t/\tau_0)}{\sqrt{1 + (t/\tau_0)^{2}}}, \qquad x = \rho\left[\sqrt{1 + (t/\tau_0)^{2}} - 1\right], \qquad (7.5)$$

and make the following estimates:

$$\text{at } t = \tau_0 = 17.6\times10^{-10}\ \text{s}, \quad v = \frac{1}{\sqrt{2}}\,c = 0.707\,c, \quad x = (\sqrt{2} - 1)\rho = 0.414\,\rho;$$

$$\text{at } t = 2\tau_0 = 35.2\times10^{-10}\ \text{s}, \quad v = \frac{2}{\sqrt{5}}\,c = 0.89\,c, \quad x = (\sqrt{5} - 1)\rho = 1.236\,\rho; \qquad (7.6)$$

$$\text{at } t = 3\tau_0 = 52.8\times10^{-10}\ \text{s}, \quad v = \frac{3}{\sqrt{10}}\,c = 0.95\,c, \quad x = (\sqrt{10} - 1)\rho = 2.162\,\rho.$$

It is then seen that at $t = 3\tau_0 = 52.8\times10^{-10}$ s the particle has traversed 2.162 units of distance = 1.14 m from the origin, and has gained a speed of 0.95 c. We have marked this point on the $X^1$-axis with an upward arrow ↑.
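These estimates are easy to reproduce. The Python sketch below uses the rounded constants quoted in the text and assumes that the velocity and displacement follow $v/c = u/\sqrt{1+u^2}$ and $x/\rho = \sqrt{1+u^2} - 1$ with $u = t/\tau_0$, which is the form consistent with the estimates above. The function names are illustrative.

```python
import math

E_CHARGE = 1.6e-19   # electron charge in C (rounded, as in the text)
M0 = 9.11e-31        # electron rest mass in kg (rounded, as in the text)
C = 3.0e8            # speed of light in m/s (rounded, as in the text)

def characteristic_scales(E_field):
    """Return (a, tau0, rho) for a uniform field E_field in V/m."""
    a = E_CHARGE * E_field / M0   # proper acceleration, m/s^2
    tau0 = C / a                  # characteristic time, s
    rho = C * tau0                # unit length c^2/a, m
    return a, tau0, rho

def velocity_over_c(u):
    """v/c at coordinate time t = u * tau0 (assumed form of Eq. (4.58))."""
    return u / math.sqrt(1.0 + u * u)

def displacement_over_rho(u):
    """x/rho at coordinate time t = u * tau0 (assumed form of Eq. (4.59))."""
    return math.sqrt(1.0 + u * u) - 1.0

if __name__ == "__main__":
    a, tau0, rho = characteristic_scales(1.0e6)
    print(f"a = {a:.3e} m/s^2, tau0 = {tau0:.3e} s, rho = {rho:.3f} m")
    for u in (1, 2, 3):
        print(u, round(velocity_over_c(u), 3), round(displacement_over_rho(u), 3))
```

Without rounding the acceleration to $0.17\times10^{18}$ m/s², the scales come out about 3% smaller ($\tau_0 \approx 1.71\times10^{-9}$ s, $\rho \approx 0.51$ m), while the dimensionless ratios $v/c$ and $x/\rho$ are unaffected.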


7.3. Lorentz Transformation in Space-Time 


7.3.1. Graphical procedure 


How can we represent a Lorentz transformation in space-time? We need the answer for a better understanding of Special Relativity. In this section, we shall demonstrate Minkowski’s graphical construction of the Lorentz transformation, and use this construction to resolve the paradoxes of length contraction and time dilation.

We have explained the procedure in Fig. 7.4. In order to make the drawings less cluttered we have replaced the $(X^1, X^0)$-axes with (X, Y)-axes, and the $(x^1, x^0)$ coordinates with (x, y) coordinates. We shall explain the procedure in two steps.


Step 1 : SET UP THE COORDINATE AXES (X', Y’), WITH SCALES SHOWING TIC 
MARKS AT UNIT INTERVALS. 


Fig. 7.4. Graphical construction of Lorentz transformation. 


The first part is shown in Fig. 7.4(a). Draw the X- and Y-axes (perpendicular to each other) with the origin at O. The straight line OY′ making an angle φ with the Y-axis (φ < 45°) is the new Y′-axis of the transformation. Γ is the hyperbola $y^2 - x^2 = 1$, and the straight line y = x is its asymptote. A is the point of intersection of Γ with the Y′-axis.

$$\text{Intersection } A:\quad (x_A, y_A) = \frac{(\sin\phi,\ \cos\phi)}{\sqrt{\cos^{2}\phi - \sin^{2}\phi}}. \qquad (7.7)$$

Now scale the Y′-axis by defining one unit as the intercept OA, equal to

$$\lambda_y = \sqrt{x_A^{2} + y_A^{2}} = \frac{1}{\sqrt{\cos^{2}\phi - \sin^{2}\phi}}, \qquad (7.8)$$

to be called the scale factor. Now draw the straight line #1, which is tangent to the hyperbola at A, intersecting the asymptote at B.

$$\begin{aligned}
\text{tangent:}&\quad \left.\frac{dy}{dx}\right|_A = x_A/y_A = \tan\phi,\\
\text{\#1:}&\quad y - y_A = (x - x_A)\tan\phi,\\
\text{asymptote:}&\quad y = x,\\
\text{Intersection } B:&\quad (x_B, y_B) = \sqrt{\frac{\cos\phi + \sin\phi}{\cos\phi - \sin\phi}}\;(1, 1).
\end{aligned} \qquad (7.9)$$

Now complete the parallelogram OABC. The line OC extended onward is the X′-axis. The point C lies at the intersection of the X′-axis and the line #2.

$$\begin{aligned}
\text{\#2:}&\quad y - y_B = (x - x_B)\cot\phi,\\
X'\text{-axis:}&\quad y = x\tan\phi,\\
\text{Intersection } C:&\quad (x_C, y_C) = \frac{(\cos\phi,\ \sin\phi)}{\sqrt{\cos^{2}\phi - \sin^{2}\phi}}.
\end{aligned} \qquad (7.10)$$

Now scale the X′-axis by defining one unit as the intercept OC, equal to

$$\lambda_x = \sqrt{x_C^{2} + y_C^{2}} = \frac{1}{\sqrt{\cos^{2}\phi - \sin^{2}\phi}}. \qquad (7.11)$$

Note that $\lambda_y = \lambda_x$. Therefore, the two axes have a common scale factor

$$\lambda \equiv \lambda_x = \lambda_y = \gamma\sqrt{1 + \beta^{2}}, \quad\text{where}\quad \beta = \tan\phi < 1, \quad \gamma = \frac{1}{\sqrt{1 - \beta^{2}}}. \qquad (7.12)$$


For the actual graphical construction of the LT, we have taken φ = 15°. This gives β = tan φ = 0.2679; γ = 1.038; λ = 1.074.
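The three numbers quoted for φ = 15° follow directly from Eqs. (7.8) and (7.12). A minimal Python sketch (the function name `scale_factors` is illustrative) evaluates them and confirms that the two expressions for the scale factor agree.

```python
import math

def scale_factors(phi_deg):
    """Return (beta, gamma, lam) for an axis tilt of phi_deg degrees."""
    phi = math.radians(phi_deg)
    beta = math.tan(phi)                       # Eq. (7.12)
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    # Eq. (7.8): unit intercept on the tilted axis.
    lam = 1.0 / math.sqrt(math.cos(phi) ** 2 - math.sin(phi) ** 2)
    return beta, gamma, lam

if __name__ == "__main__":
    beta, gamma, lam = scale_factors(15.0)
    print(round(beta, 4), round(gamma, 3), round(lam, 3))
    # Consistency with lam = gamma * sqrt(1 + beta^2):
    print(lam - gamma * math.sqrt(1.0 + beta * beta))
```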


Step 2: TRANSFORM THE (x, y) COORDINATES OF A POINT P TO (x’, y’), USING 
THE SCALE FACTOR A. 

We have explained the steps in Fig. 7.4(b). Let us write the coordinates of the point P as $(x_0, y_0)$. The straight line drawn parallel to the Y′-axis and passing through P intersects the X′-axis at Q, and the straight line drawn parallel to the X′-axis and passing through P intersects the Y′-axis at R. Then Q and R are the projections of P on the X′- and Y′-axes, respectively. Let us find the (x, y) coordinates of Q and R:

$$\begin{aligned}
\text{Line } PQ:&\quad y - y_0 = (x - x_0)\cot\phi,\\
X'\text{-axis:}&\quad y = x\tan\phi,\\
\text{Intersection } Q:&\quad (x_Q, y_Q) = \lambda\gamma\,(x_0 - \beta y_0)\,(\cos\phi,\ \sin\phi),
\end{aligned} \qquad (7.13)$$

where we have used the identity

$$\cos^{2}\phi\,(1 - \tan^{2}\phi) = \frac{1}{\lambda^{2}}. \qquad (7.14)$$

Similarly,

$$\begin{aligned}
\text{Line } PR:&\quad y - y_0 = (x - x_0)\tan\phi,\\
Y'\text{-axis:}&\quad y = x\cot\phi,\\
\text{Intersection } R:&\quad (x_R, y_R) = \lambda\gamma\,(y_0 - \beta x_0)\,(\sin\phi,\ \cos\phi).
\end{aligned} \qquad (7.15)$$

The coordinates $(x_0', y_0')$ of the point P with respect to the X′, Y′-axes are now calculated in the following way:

$$L = OQ = \sqrt{x_Q^{2} + y_Q^{2}} = \lambda\gamma\,(x_0 - \beta y_0), \qquad (7.16a)$$

$$K = OR = \sqrt{x_R^{2} + y_R^{2}} = \lambda\gamma\,(y_0 - \beta x_0), \qquad (7.16b)$$

$$x_0' = L/\lambda = \gamma\,(x_0 - \beta y_0), \qquad (7.16c)$$

$$y_0' = K/\lambda = \gamma\,(y_0 - \beta x_0). \qquad (7.16d)$$

Note that we have not used any principle of relativity in the above geometrical construction. It was a mathematical exercise in coordinate transformation (x, y) → (x′, y′) with the final result:

$$y' = \gamma\,(y - \beta x), \qquad x' = \gamma\,(x - \beta y); \qquad \beta < 1; \quad \gamma = \frac{1}{\sqrt{1 - \beta^{2}}}. \qquad (7.17)$$


This result is identical with the Lorentz transformation formulas derived in 
Eqs. (3.8). In this sense, the construction presented here can be called a 
Graphical Construction of Lorentz Transformation. 
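The two steps of the construction can be mimicked numerically: intersect the two lines through P with the tilted axes, measure the intercepts OQ and OR, and divide by the scale factor. The Python sketch below does this and compares the outcome with Eq. (7.17). The function names are illustrative, and the intercepts are given a sign so that points on the negative side of an axis get negative coordinates, as the formulas (7.16a) and (7.16b) imply.

```python
import math

def graphical_transform(x0, y0, phi):
    """Coordinates (x', y') of P = (x0, y0) read off the tilted, rescaled axes."""
    t = math.tan(phi)
    lam = 1.0 / math.sqrt(math.cos(phi) ** 2 - math.sin(phi) ** 2)
    # Q: line through P with slope cot(phi) meets the X'-axis y = x tan(phi).
    xQ = (x0 - y0 * t) / (1.0 - t * t)
    yQ = xQ * t
    # R: line through P with slope tan(phi) meets the Y'-axis x = y tan(phi).
    yR = (y0 - x0 * t) / (1.0 - t * t)
    xR = yR * t
    # Signed intercepts OQ and OR, then division by the scale factor.
    L = math.copysign(math.hypot(xQ, yQ), xQ)
    K = math.copysign(math.hypot(xR, yR), yR)
    return L / lam, K / lam

def lorentz_transform(x0, y0, beta):
    """Eq. (7.17): x' = gamma (x - beta y), y' = gamma (y - beta x)."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return gamma * (x0 - beta * y0), gamma * (y0 - beta * x0)

if __name__ == "__main__":
    phi = math.radians(15.0)
    print(graphical_transform(2.0, 3.0, phi))
    print(lorentz_transform(2.0, 3.0, math.tan(phi)))
```

The two printed pairs should agree to rounding error for any point P and any tilt below 45°.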


7.3.2. Graphical construction of length contraction 


We shall resolve the paradox of length contraction graphically using space-time axes of the S and S′ frames, as depicted in Fig. 7.5. It is a partial copy of Fig. 7.4(a) with the following important differences: (1) we have changed the (Y, X)-axes to the $(X^0, X^1)$-axes of the S frame, and similarly (Y′, X′) to $(X'^0, X'^1)$ of the S′ frame; (2) we have expanded the axes so that the tic marks are further apart.

However, since we shall depend on the construction shown in Fig. 7.4(a), we shall use the same coordinates used there, namely, (y, x); (y′, x′), synonymously with the relativity coordinates $(x^0, x^1)$; $(x'^0, x'^1)$, respectively.

A meter stick, i.e. a rigid rod of unit length, is lying along the $X'^1$-axis of the inertial frame S′, which is moving relative to S with the velocity βc in the X-direction. L and R represent its left and right ends. The world lines of these two ends are shown as the straight lines LM and RN. The “world view” of the rod at $x'^0 = 0$ is shown as a thick solid line of unit length, and at four other values of $x'^0$ as thick, but broken lines, also of unit length.


Path of the meter stick in physical space 


Fig. 7.5. Graphical construction of Lorentz contraction. 


Let us first understand what we mean by the length of the rod in S, with respect to which it is moving. We had provided the answer in the paragraph following Eq. (2.26) on p. 41. It is the distance between the points L and $R^0$, marking the left and the right ends of the rod on the $X^1$-axis simultaneously, i.e. at the same instant $x^0 = 0$ (same as t = 0), as the rod was speeding away along this axis. We have shown the motion of the rod in the lower diagram.

Referring to Fig. 7.4(a), let $R^0$ be the intersection of the world line #2 (i.e. the straight line CB) with the $X^1$-axis. The intercept ℓ, in Fig. 7.5, is the $x^1$ coordinate of $R^0$, and represents the length of the stick in S. We shall calculate ℓ using the $(x_B, y_B)$ coordinates of B from Eq. (7.9).

$$\begin{aligned}
\text{\#2:}&\quad y - y_B = \cot\phi\,(x - x_B),\\
X\text{-axis:}&\quad y = 0,\\
\text{At } R^0:&\quad \ell = x_B - \tan\phi\; y_B = (1 - \beta)\sqrt{\frac{\cos\phi + \sin\phi}{\cos\phi - \sin\phi}} = \frac{1}{\gamma}.
\end{aligned} \qquad (7.18)$$


Conclusion: The length of a straight rod which is 1 m long in its rest frame is measured to be ℓ = 1/γ in the frame S with respect to which it is moving with velocity cβ parallel to its length. We have thus derived the length contraction formula by graphical construction.
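The last line of Eq. (7.18) hides a small algebraic step: since $(\cos\phi + \sin\phi)/(\cos\phi - \sin\phi) = (1+\beta)/(1-\beta)$, the intercept equals $\sqrt{(1-\beta)(1+\beta)} = \sqrt{1-\beta^2}$. The following Python sketch, with an illustrative function name, evaluates the intercept from the coordinates of B and compares it with 1/γ.

```python
import math

def contracted_length(phi):
    """Intercept of line #2 on the X-axis, computed from point B of Eq. (7.9)."""
    c, s = math.cos(phi), math.sin(phi)
    xB = yB = math.sqrt((c + s) / (c - s))
    # Line #2 through B with slope cot(phi) meets y = 0 at x = xB - yB tan(phi).
    return xB - yB * math.tan(phi)

if __name__ == "__main__":
    phi = math.radians(15.0)
    beta = math.tan(phi)
    print(contracted_length(phi), math.sqrt(1.0 - beta * beta))
```

For φ = 15° both numbers are about 0.963, i.e. the moving meter stick is measured to be roughly 3.7% shorter.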


7.3.3. Graphical construction of time dilation 


We have explained the construction in Fig. 7.6. Again, the axes are the same as in Fig. 7.4(a), but they have been expanded even further so that the tic marks # 1 on the axes appear near the margins.

We have presented two events O and A, both occurring at the same spatial location in S′, or, to be more precise, at the same space coordinate $x'^1 = 0$, but at two different time instants, separated by a time interval of one unit in the frame S′. This means that $x'^0 = 0$ for O, and $x'^0 = 1$ for A (as defined while writing Eq. (7.8)). In this sense, the proper time between O and A is 1 unit.


Fig. 7.6 Graphical construction of time dilation. 


We want to find out graphically the time interval between the same two events in the frame S. The procedure is very easy. Just find out the y coordinate of A, from Eq. (7.7):

$$
y_A = \frac{\cos\theta}{\sqrt{\cos^2\theta - \sin^2\theta}} = \gamma.
$$

Hence the conclusion:


Conclusion: If the time interval between two events O and A occurring at the same spatial coordinates in a frame S' is 1 unit (so that the proper time between the events is one unit), then the time interval between the same two events as measured in the frame S which is moving uniformly relative to S' with velocity $c\beta$, will be $\gamma$ units.

Note from the figure that there is a certain spatial separation of € 
between the two events in S. 


7.3.4. Simultaneity, or absence of it 


On the same diagram, presented as Fig. 7.6, we have tried to resolve the paradox surrounding simultaneity. Three events $\Theta_a$, $\Theta_b$, $\Theta_c$, shown on the X'¹-axis, are simultaneous in the frame S'. They occur at the same time $x'^0 = 0$. Projecting these three events on the X⁰-axis, we find that they occur at different time coordinates $x^0_a$, $x^0_b$, $x^0_c$ in the frame S.
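
Both graphical results, time dilation and the loss of simultaneity, can be reproduced with a few lines of arithmetic. The sketch below assumes the usual inverse boost from S' to S along the X¹-axis; the helper `to_S`, the value β = 0.6 and the sample positions 0, 1, 2 are illustrative assumptions.

```python
import math

def to_S(x0p, x1p, beta):
    """Map an event (x'^0, x'^1) of S' to (x^0, x^1) in S (inverse boost)."""
    g = 1.0 / math.sqrt(1.0 - beta**2)
    return g * (x0p + beta * x1p), g * (x1p + beta * x0p)

beta = 0.6  # illustrative value

# Time dilation: O and A occur at x'^1 = 0, one unit of x'^0 apart.
x0_O, x1_O = to_S(0.0, 0.0, beta)
x0_A, x1_A = to_S(1.0, 0.0, beta)
print(x0_A - x0_O, x1_A - x1_O)   # gamma and gamma*beta

# Simultaneity: three events with x'^0 = 0 at x'^1 = 0, 1, 2.
times_in_S = [to_S(0.0, x1p, beta)[0] for x1p in (0.0, 1.0, 2.0)]
print(times_in_S)                 # three different x^0 values
```

The second printed pair shows that the two events, which share a location in S', are separated in S both in time (by γ) and in space (by γβ).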


7.4. Minkowski Space-Time 


What is the length of the segment of the world line between two event points $\Theta_a$ and $\Theta_b$?

When I am sleeping, I am still walking a long way in space-time along 
the time axis. Can I say that the distance I have travelled is c times, say 6 
hours of sleep? 

During the daytime I have commuted from Mysore to Bangalore, a distance of 140 km, in 3 hours. I have moved along some XY-plane, as well as along the time axis. Shall I apply the Pythagorean theorem to arrive at a distance of $\sqrt{(3c)^2 + (140)^2}$ km, covered in space-time?

We shall find an appropriate definition of “length” in space-time. The reader may ask, “What is the need for measuring length in an abstract four-dimensional world which we cannot even visualize before our eyes?” We need a measure of length because we are going to construct four-dimensional vectors in space-time. The most elementary such vector is a “directed straight line” from an event Θ to another event Φ. Length is the only invariant (i.e. something that does not change with a change of the coordinate system) associated with a straight line, whether in space-time or in physical space.

Let us review how length is defined in analytical geometry in terms of coordinates. Consider a straight rod whose endpoints A and B are at the coordinates $(x_1, y_1, z_1)$ and $(x_2, y_2, z_2)$ with reference to a Cartesian frame S. The length of the rod is given by the Pythagorean expression

$$
\ell^2 = (x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2. \tag{7.19}
$$

If we look at the rod from another angle, say, by shifting our telescope, or better still by rotating our frame of reference, we shall still measure the same length $\ell$ of the rod.

Suppose we perform a rotation of the axes by an angle $\theta$ about the Z-axis, to obtain a new frame S'. It is an elementary exercise to show that the above rotation causes a transformation of coordinates from $(x, y, z)$ to $(x', y', z')$, given by

$$
\begin{aligned}
x' &= x\cos\theta + y\sin\theta,\\
y' &= -x\sin\theta + y\cos\theta,\\
z' &= z.
\end{aligned}
\tag{7.20}
$$

Therefore, if $(x'_1, y'_1, z'_1)$ and $(x'_2, y'_2, z'_2)$ be the Cartesian coordinates of A and B in S', the transformed length $\ell'$ of the rod in S' will be

$$
\ell'^2 = (x'_2 - x'_1)^2 + (y'_2 - y'_1)^2 + (z'_2 - z'_1)^2. \tag{7.21}
$$

Applying the transformation (7.20) we get

$$
\begin{aligned}
\ell'^2 &= \{(x_2 - x_1)\cos\theta + (y_2 - y_1)\sin\theta\}^2\\
&\quad + \{-(x_2 - x_1)\sin\theta + (y_2 - y_1)\cos\theta\}^2 + (z_2 - z_1)^2\\
&= (x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2 = \ell^2.
\end{aligned}
\tag{7.22}
$$


In summary, the length expression as given in Eq. (7.19) remains invariant in our familiar physical space, under all transformations of coordinates due to rotation. Since Euclid’s geometry is valid in this space, we shall often refer to this space as the Euclidean space and denote it by the symbol E³. The square of the distance $d\ell$ between two neighbouring points $(x, y, z)$ and $(x + dx, y + dy, z + dz)$ in E³ is given by the expression

$$
d\ell^2 = dx^2 + dy^2 + dz^2. \tag{7.23}
$$

We shall call Equation (7.23) an expression for the line element in E³. A formal name for the above expression is metric. Equation (7.23) expresses a Euclidean metric.

We cannot expect invariance of the Euclidean metric (7.23) under a Lorentz transformation. This is because the length of a stick appears different to two different observers who are moving relative to each other, so that $\ell' \neq \ell$. On the other hand, if we consider two events Θ and Φ with coordinates $(ct_1, x_1, y_1, z_1)$ and $(ct_2, x_2, y_2, z_2)$ in some Lorentz frame S, then the value of the expression

$$
\begin{aligned}
s^2 &= c^2(t_2 - t_1)^2 - \ell^2\\
&= c^2(t_2 - t_1)^2 - [(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2]
\end{aligned}
\tag{7.24}
$$

remains invariant under a Lorentz transformation (cf. Eq. (3.11)). Therefore, if we are looking for a candidate to represent “length” in an invariant way in space-time, then the expression given in Eq. (7.24) should satisfy the requirement.
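
The contrast between the two expressions can be seen on a concrete pair of events. This is a minimal sketch, assuming a boost along the X-axis and coordinates in which the time entry is already multiplied by c; the function names and the sample events are illustrative, not from the text.

```python
import math

def boost(event, beta):
    """Boost (ct, x, y, z) along the X-axis with velocity c*beta."""
    ct, x, y, z = event
    g = 1.0 / math.sqrt(1.0 - beta**2)
    return (g * (ct - beta * x), g * (x - beta * ct), y, z)

def interval_sq(e1, e2):
    """s^2 of Eq. (7.24): (c dt)^2 minus the squared spatial separation."""
    dct, dx, dy, dz = (b - a for a, b in zip(e1, e2))
    return dct**2 - (dx**2 + dy**2 + dz**2)

def separation_sq(e1, e2):
    """Pythagorean sum of Eq. (7.19), using the spatial parts only."""
    return sum((b - a)**2 for a, b in zip(e1[1:], e2[1:]))

e1, e2 = (0.0, 0.0, 0.0, 0.0), (5.0, 3.0, 1.0, 2.0)   # arbitrary events
b1, b2 = boost(e1, 0.6), boost(e2, 0.6)
print(interval_sq(e1, e2), interval_sq(b1, b2))       # equal: s^2 is invariant
print(separation_sq(e1, e2), separation_sq(b1, b2))   # differ under the boost
```

For these sample events the interval is 11 in both frames, while the spatial sum changes from 14 to 5.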

The letter s used above is the invariant “length” in relativity. However, 
since the word itself connotes a measure involving a meter-stick — as that 
of a reel of yarn or the width of a fabric — and since we cannot stretch a 
meter stick across space-time, it is better to suggest an alternative name, for 
which we choose “line interval”. 

The differential line interval between two infinitely close events having 
coordinates (ct, x, y, z) and (ct + cdt, x + dx, y + dy, z + dz) has a greater 
relevance in the geometry of space-time. We shall write this as 


$$
ds^2 = c^2 dt^2 - dx^2 - dy^2 - dz^2. \tag{7.25}
$$


Equation (7.25) will represent the line element, or the metric, better known as the Minkowski metric, for space-time. Any LT from a frame S to another frame S' will guarantee invariance of this metric:

$$
ds^2 = c^2 dt^2 - dx^2 - dy^2 - dz^2 = c^2 dt'^2 - dx'^2 - dy'^2 - dz'^2. \tag{7.26}
$$


This is a consequence of the definition of LT as written following Eq. 
(3.11). 

When the reader takes up the study of the General Theory of Relativity, he/she will realize that it is not possible to obtain the simple metric of Eq. (7.25) in the presence of a true gravitational field. Massive gravitating objects, like pulsars, quasars and black holes, grossly distort the geometry of space-time, thereby invalidating the Pythagorean expression (7.23) for the length of a “straight line”. However, for our study of Special Relativity, from which gravity is excluded, we shall always assume the Minkowski metric of Eq. (7.25). The space-time whose geometry is described by the Minkowski metric is called Minkowski space-time, and will be denoted by the symbol M⁴.

The reader may have already noticed that the right-hand sides of Eqs. (7.24) and (7.25) are not necessarily positive. In fact, they can be positive, negative, or even zero. When the events Θ, Φ for which we have written the line interval (7.24) are on the world line of a material particle, $s^2 > 0$, because all material particles move slower than light. In that case, we say that the line interval between these events, or the line element joining these events, is time-like. If $s^2 < 0$, the interval is space-like. On the other hand, if $s^2 = 0$, the interval is light-like. (We have already defined these terms in Sec. 3.4.)

Suppose a star, which is 10 light years away, has “exploded today” into a supernova. Call this event Θ. We shall see this supernova 10 years later in the form of a brilliant flash of light. Let this “seeing at a certain observatory” on earth be called the event Φ. Then the line interval between these two events is zero, according to Eq. (7.24), even though the intervening “distance” between the events is 10 × 365 × 24 × 60 × 60 × 3 × 10⁵ km!
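
The arithmetic of this example is easy to verify. The sketch below uses the same rounded value of c (3 × 10⁵ km/s) and the same 365-day year as the product quoted above; the variable names are illustrative.

```python
C_KM_PER_S = 3.0e5                        # speed of light, rounded as in the text
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

distance_km = 10 * SECONDS_PER_YEAR * C_KM_PER_S   # 10 light years
elapsed_s = 10 * SECONDS_PER_YEAR                  # light needs 10 years

# Eq. (7.24) with the spatial separation along the line of sight, in km^2.
s_squared = (C_KM_PER_S * elapsed_s)**2 - distance_km**2
print(distance_km)   # about 9.46e13 km
print(s_squared)     # 0.0: the interval is light-like
```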

In Sec. 7.8, we shall characterize a 4-vector as time-like, space-like or 
light-like depending on whether the square of its length is greater than, less 
than, or equal to zero. 


7.5. 3-Vectors, Contravariant and Covariant Families 


We shall be using vectors and tensors in E³ as well as M⁴. The new species of vectors we shall be using in M⁴ will be called 4-vectors, a term coined by Minkowski. To distinguish our familiar vectors, used so far in non-relativistic physics, we may refer to them as 3-vectors.

It is easier to build the concept of vectors in E³ and then extend the same to M⁴. A vector in E³ is a directed straight line segment with the additional property that when two of them, A and B, are added to obtain their sum C = A + B, this operation is carried out by constructing a vector triangle.

The first of these two properties gives a geometrical character to a vector. If we have chosen a frame of reference, then the length and orientation of a vector V will determine its components $V_x, V_y, V_z$ along the X-, Y-, Z-axes, which are obtained by projecting the vector along these axes.

Conversely, given the components Vx, Vy, Vz of a vector V with 
reference to a frame of reference, we can construct the vector V in space as 
a geometrical straight line. Therefore, it is often a practice to express a 
vector as an ordered triple of real numbers, which can be written either as a 
column, or as a row matrix. 


$$
\mathbf{V} = \begin{pmatrix} V_x \\ V_y \\ V_z \end{pmatrix}; \quad \text{alternatively, } \mathbf{V} = (V_x, V_y, V_z). \tag{7.27}
$$


In view of the above matrix representation, the triangle rule of vector 
addition is equivalent to a matrix addition: 


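$$
\mathbf{A} + \mathbf{B} = \begin{pmatrix} A_x + B_x \\ A_y + B_y \\ A_z + B_z \end{pmatrix}; \quad \text{alternatively, } \mathbf{A} + \mathbf{B} = (A_x + B_x,\ A_y + B_y,\ A_z + B_z). \tag{7.28}
$$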


The most elementary vector, which also serves as the patriarch of one 
family of vectors known as contravariant vectors is the infinitesimal 


displacement from a point P(x, y, z) to a neighbouring point Q(x + dx, y + 
dy, z + dz), so that 


dr = (dx, dy, dz). (7.29) 


When we change over from a frame of reference S to another one S’, which 
is rotated with respect to the first one, the coordinates get transformed 
according to Eq. (7.20). As a consequence, the vector dr changes its 
components from (dx, dy, dz) to (dx', dy’, dz’), where, 


$$
\begin{aligned}
dx' &= dx\cos\theta + dy\sin\theta,\\
dy' &= -dx\sin\theta + dy\cos\theta,\\
dz' &= dz.
\end{aligned}
\tag{7.30}
$$


Starting from the primary vector dr, one obtains other members of the contravariant family through multiplication with scalars (e.g. 1/dt, m) and differentiation with respect to time t, e.g.

$$
\begin{aligned}
\text{velocity}\quad & \mathbf{v} = \frac{1}{dt}\times d\mathbf{r} = \frac{d\mathbf{r}}{dt} && \text{(multiplication)},\\
\text{acceleration}\quad & \mathbf{a} = \frac{d\mathbf{v}}{dt} && \text{(differentiation)},\\
\text{momentum}\quad & \mathbf{p} = m \times \mathbf{v} = m\mathbf{v} && \text{(multiplication)},\\
\text{force}\quad & \mathbf{F} = \frac{d\mathbf{p}}{dt} && \text{(differentiation)},
\end{aligned}
\tag{7.31}
$$

and so on, all of which share the transformation of dr. One can now define a set of three real numbers $\mathbf{V} = (V_x, V_y, V_z)$ to constitute a contravariant vector, if they transform under a change of coordinates exactly like the components of dr. The rotation of the axes will change the components of the vector V from $(V_x, V_y, V_z)$ to $(V'_x, V'_y, V'_z)$, such that

$$
\begin{aligned}
V'_x &= V_x\cos\theta + V_y\sin\theta,\\
V'_y &= -V_x\sin\theta + V_y\cos\theta,\\
V'_z &= V_z.
\end{aligned}
\tag{7.32}
$$


To be more precise and general, we replace the particularly simple transformation formula (7.20) by

$$
x'^k = f^k(x^1, x^2, x^3) \equiv f^k(x), \quad k = 1, 2, 3, \tag{7.33}
$$

where we have written (x) to mean all the old coordinates $(x^1, x^2, x^3)$. Note that we have used superscripts (1, 2, 3) on x to mean x, y, z, respectively. A superscript will be called a contravariant index. We shall follow this practice in the remaining part of this book.

Equation (7.33) represents a general change-over from the old coordinates (x) to (x'), the most common examples being (1) Cartesian to Cartesian due to rotation of axes, just cited, and (2) Cartesian (x, y, z) to the spherical coordinate system (r, θ, φ), or vice versa.

The coordinate differentials $(dx^j)$ in the old system, and $(dx'^k)$ in the new system, are connected by the linear expression

$$
dx'^k = \sum_{j=1}^{3} \frac{\partial x'^k}{\partial x^j}\, dx^j = \sum_{j=1}^{3} R^{k}_{.j}\, dx^j, \quad k = 1, 2, 3, \tag{7.34}
$$

where the coefficients

$$
R^{k}_{.j} \equiv \frac{\partial x'^k}{\partial x^j} = \frac{\partial f^k}{\partial x^j}, \quad k, j = 1, 2, 3, \tag{7.35}
$$

constitute a 3 × 3 matrix R. They are, in general, functions of the coordinates (x). For the special case of rotation, these coefficients are independent of the coordinates and constitute an orthogonal matrix R. An orthogonal matrix is one whose inverse equals its transpose, as we have explained in Eq. (7.45) below.

Note the dot “.” followed by a small space in the subscripts, to imply the second position for the subscript, i.e. its role as the column index. The superscript k and the subscript j in $R^{k}_{.j}$ constitute the row index (first index) and the column index (second index), respectively.

For the case of the rotation with transformation represented in Eqs. (7.30) and (7.32), $R^{1}_{.1} = \cos\theta$; $R^{1}_{.2} = \sin\theta$; .... We write the complete matrix:

$$
R = \begin{pmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{7.36}
$$

Though simple, the above matrix is used to construct the most complex rotation matrix of a rigid body, consisting of precession, spin and nutation.
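
The orthogonality of the matrix (7.36), and the invariance of length that follows from it, can be checked directly. The following sketch uses plain Python lists; the helper names, the 30° angle and the sample components are illustrative assumptions.

```python
import math

def rotation_z(theta):
    """Matrix R of Eq. (7.36): rotation of the axes by theta about Z."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(a, b):
    """Product of two 3 x 3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def apply(m, v):
    """Components m^k_j v^j of the transformed triple."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

R = rotation_z(math.radians(30.0))        # illustrative angle
product = matmul(R, transpose(R))         # should be the identity matrix
dr = [1.0, 2.0, 3.0]                      # sample components (dx, dy, dz)
dr_new = apply(R, dr)                     # Eq. (7.34) for a rotation
print(product)
print(sum(x * x for x in dr), sum(x * x for x in dr_new))  # equal squared lengths
```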

In order to avoid writing the summation symbol repeatedly — for 
which occasions will arise copiously in the sequel — it has been a common 
practice among relativists to adopt the Einstein Summation Convention. For 
this purpose, we introduce a contravariant index as the one which appears 
either as a superscript in the numerator, or as a subscript in the denominator. 
Its opposite is the covariant index, which, therefore, appears either as a 
subscript in the numerator or as a superscript in the denominator. In the 
expression (7.34), for instance, k appears as a contravariant index on either 
side of the equation, whereas j appears first as a covariant index, and then 
as a contravariant index. 

The Einstein convention proposes that if in a certain term of an expression the same index appears twice, once in the contravariant form and another time in the covariant form, then that index is to be interpreted as a summation index, and the formal summation sign Σ is to be dropped.

Adopting this convention we rewrite Eq. (7.34) as (compare this with the convention adopted on page 147)

$$
dx'^k = \frac{\partial x'^k}{\partial x^j}\, dx^j = R^{k}_{.j}\, dx^j, \quad k = 1, 2, 3. \tag{7.37}
$$


A summation index, e.g. j in Eq. (7.37), is a dummy index. A dummy index can be replaced by some other dummy index. For example,

$$
dx'^k = R^{k}_{.j}\, dx^j = R^{k}_{.m}\, dx^m. \tag{7.38}
$$

We shall complete the suggestion following Eq. (7.31) to define a contravariant vector A to be an ordered triple of real numbers $(A^k)$ that transform under a coordinate change (x) to (x') of the form (7.33) to another triple $(A'^k)$, such that

$$
A'^k = R^{k}_{.j} A^j, \quad \text{where} \quad R^{k}_{.j} = \frac{\partial x'^k}{\partial x^j}, \tag{7.39}
$$

with k, j = 1, 2, 3, are the components of a 3 × 3 matrix R.

Note that we have indicated the components of the contravariant vector 
with a superscript. 

Equation (7.32) is a special case of (7.39), in which the components $R^{k}_{.j}$ of R are constants, i.e. independent of the coordinates. This happens in the case of Cartesian to Cartesian transformation, like rotation of the axes.

Another class of vectors extensively used in physics is the covariant family. Their prototype, the gradient of a scalar field ψ(x, y, z), is denoted as ∇ψ. Let us write

$$
\mathbf{G} = \nabla\psi, \quad \text{having components} \quad G_k = \frac{\partial \psi}{\partial x^k}, \quad k = 1, 2, 3. \tag{7.40}
$$

Under the transformation (7.33), these components will change to $G'_k = \partial\psi/\partial x'^k$ in the system (x'). Using the chain rule of calculus, one then gets

$$
G'_k = \frac{\partial \psi}{\partial x'^k} = \frac{\partial \psi}{\partial x^j}\frac{\partial x^j}{\partial x'^k} = G_j M^{j}_{.k}, \quad \text{where} \quad M^{j}_{.k} = \frac{\partial x^j}{\partial x'^k} \tag{7.41}
$$

are the components of a 3 × 3 matrix M. Let P = MR be the product of the two matrices M and R. Then

$$
P^{j}_{.l} = M^{j}_{.k} R^{k}_{.l} = \frac{\partial x^j}{\partial x'^k}\frac{\partial x'^k}{\partial x^l} = \frac{\partial x^j}{\partial x^l} = \delta^{j}_{l}, \quad \text{or} \quad MR = 1 = \text{identity matrix}. \tag{7.42}
$$

Equation (7.42) shows that the matrix M is identical with $R^{-1}$. We can therefore rewrite the transformation equation (7.41) for the gradient vector as

$$
G'_k = G_j (R^{-1})^{j}_{.k}. \tag{7.43}
$$

Therefore, we define a covariant vector V as an ordered triple of real numbers $(V_k)$ that transform like ∇ψ under a change of coordinates (x) to (x'). This means that the transformed components of the vector will be:

$$
V'_k = V_j (R^{-1})^{j}_{.k}. \tag{7.44}
$$


Note that we have indicated the components of the covariant vector with 
a subscript. 

It is then clear that the contravariant vectors and the covariant vectors 
have distinctively different characters. Isn’t it then strange that in physics so 
far we have treated them alike and represented both by directed straight 
lines? 

The reason is that the formal difference between Eqs. (7.39) and (7.44) disappears when we consider transformation from one Cartesian system S to another Cartesian system S' due to a rotation of the axes. The transformation matrix R is then an orthogonal matrix, as stated following Eq. (7.35). By its definition, the inverse of an orthogonal matrix is its transpose. That is,

$$
R^{-1} = R^{\text{trans}}, \quad \text{or} \quad (R^{-1})^{j}_{.k} = R^{k}_{.j}. \tag{7.45}
$$

Using this property, we can rewrite (7.44) as

$$
V'_k = \sum_{j=1}^{3} R^{k}_{.j} V_j. \tag{7.46}
$$

The above transformation rule is identical with (7.39). It is for this reason that the discriminatory labels “contravariant” and “covariant” are unnecessary in classical physics.
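
The distinction becomes visible as soon as the transformation is not orthogonal. The sketch below assumes a hypothetical change of coordinates x' = 2x, y' = y, z' = z/2 (a pure rescaling, chosen only because its inverse is obvious) and applies the rules (7.39) and (7.44) to the same triple of numbers.

```python
def contravariant(R, A):
    """A'^k = R^k_j A^j, Eq. (7.39)."""
    return [sum(R[k][j] * A[j] for j in range(3)) for k in range(3)]

def covariant(R_inv, G):
    """G'_k = G_j (R^-1)^j_k, Eq. (7.44)."""
    return [sum(G[j] * R_inv[j][k] for j in range(3)) for k in range(3)]

# Hypothetical non-orthogonal change of coordinates: x' = 2x, y' = y, z' = z/2.
R = [[2.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.5]]
R_inv = [[0.5, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 2.0]]

A = [1.0, 2.0, 3.0]    # components of a contravariant vector
G = [1.0, 2.0, 3.0]    # components of a covariant vector (same numbers)

A_new = contravariant(R, A)      # [2.0, 2.0, 1.5]
G_new = covariant(R_inv, G)      # [0.5, 2.0, 6.0]
scalar_old = sum(g * a for g, a in zip(G, A))
scalar_new = sum(g * a for g, a in zip(G_new, A_new))
print(A_new, G_new, scalar_old, scalar_new)
```

The two triples now transform differently, yet the contraction $G_k A^k$ keeps the value 14 in both systems, which is what makes the pairing of the two families useful.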

When the components of a 3-vector V change from $\{V_k\}$ to $\{V'_k\}$ through an orthogonal transformation, its length, by which we mean its magnitude, does not change, i.e.

$$
V^2 \equiv \sum_{k=1}^{3} [V_k]^2 = \sum_{k=1}^{3} [V'_k]^2, \tag{7.47}
$$

as exemplified in Eq. (7.22). We shall adopt Eq. (7.47) for defining a 3-vector.


7.6. 3-Tensors, Contravariant and Covariant Families 


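A vector is a geometrical object that does not change with a coordinate transformation, like a hexahedron which remains the same geometrical object no matter from which angle you look at it. This means that if under a coordinate transformation (x) to (x') the scalar components of a vector V change from $(V_x, V_y, V_z)$ to $(V'_x, V'_y, V'_z)$, then the unit vectors $(\mathbf{e}_x, \mathbf{e}_y, \mathbf{e}_z)$ will change to $(\mathbf{e}'_x, \mathbf{e}'_y, \mathbf{e}'_z)$ such that

$$
\begin{aligned}
& V_x \mathbf{e}_x + V_y \mathbf{e}_y + V_z \mathbf{e}_z = V'_x \mathbf{e}'_x + V'_y \mathbf{e}'_y + V'_z \mathbf{e}'_z.\\
\text{Same as,}\quad & V^j \mathbf{e}_j = V'^k \mathbf{e}'_k.\\
\text{However,}\quad & V'^k = R^{k}_{.j} V^j, \text{ from Eq. (7.39).}\\
\text{Hence,}\quad & \mathbf{e}_j V^j = \mathbf{e}'_k R^{k}_{.j} V^j,\\
\text{or}\quad & \mathbf{e}_j = \mathbf{e}'_k R^{k}_{.j},\\
\text{or}\quad & \mathbf{e}_j (R^{-1})^{j}_{.l} = \mathbf{e}'_k R^{k}_{.j} (R^{-1})^{j}_{.l} = \mathbf{e}'_k \delta^{k}_{l} = \mathbf{e}'_l.
\end{aligned}
$$

Note that $(R^{-1})^{j}_{.l}$ represents the jl component of the matrix $R^{-1}$, the inverse of R. Hence the theorem.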


Theorem 7.1. Let R represent the transformation matrix corresponding to a change of coordinates (x) → (x'), so that a contravariant vector A changes its components from $(A^k)$ to $(A'^k)$ by the rule given by Eq. (7.39). Then, the transformation of the unit vectors $(\mathbf{e}_j) \to (\mathbf{e}'_j)$ will be given by the rule:

$$
\mathbf{e}'_j = \mathbf{e}_k (R^{-1})^{k}_{.j}, \tag{7.48a}
$$

and the converse,

$$
\mathbf{e}_j = \mathbf{e}'_k R^{k}_{.j}. \tag{7.48b}
$$


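We shall obtain the transformation formula for the contravariant 3-tensor $T^{ij}$ using Eq. (7.48b).

$$
\begin{aligned}
\text{Let}\quad & \mathbf{T} = T^{ij}\, \mathbf{e}_i \mathbf{e}_j = T'^{kl}\, \mathbf{e}'_k \mathbf{e}'_l.\\
\text{Now,}\quad & T^{ij}\, \mathbf{e}_i \mathbf{e}_j = T^{ij} \{\mathbf{e}'_k R^{k}_{.i}\}\{\mathbf{e}'_l R^{l}_{.j}\} = \{R^{k}_{.i} R^{l}_{.j} T^{ij}\}\, \mathbf{e}'_k \mathbf{e}'_l.\\
\text{Hence,}\quad & T'^{kl} = R^{k}_{.i} R^{l}_{.j} T^{ij}.
\end{aligned}
\tag{7.49}
$$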


Going back to Eq. (7.39) we draw the following conclusions: 


• A scalar is a real number that does not change under a coordinate transformation.

• A vector is a row or column matrix of three real numbers that transforms through a single application of R as shown in (7.39).

• A tensor is a 3 × 3 square matrix that transforms through a double application of R as shown in (7.49).


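We shall illustrate the concepts outlined above by citing the example of rotation about the Z-axis for which we had written the transformation matrix R in Eq. (7.36). Its inverse $R^{-1}$ is its transpose. We shall write R and $R^{-1}$ side by side for contrast:

$$
R = \begin{pmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}, \quad
R^{-1} = \begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{7.50}
$$

The base vectors should now transform as

$$
\begin{aligned}
& \mathbf{e}'_j = \mathbf{e}_k (R^{-1})^{k}_{.j},\\
\text{so that,}\quad & \mathbf{e}'_1 = \mathbf{e}_1 (R^{-1})^{1}_{.1} + \mathbf{e}_2 (R^{-1})^{2}_{.1} = \cos\theta\, \mathbf{e}_1 + \sin\theta\, \mathbf{e}_2,\\
& \mathbf{e}'_2 = \mathbf{e}_1 (R^{-1})^{1}_{.2} + \mathbf{e}_2 (R^{-1})^{2}_{.2} = -\sin\theta\, \mathbf{e}_1 + \cos\theta\, \mathbf{e}_2,\\
& \mathbf{e}'_3 = \mathbf{e}_3.
\end{aligned}
\tag{7.51}
$$

Comparing Eq. (7.51) with Eq. (7.32), we find that the vector components $(V'^1, V'^2, V'^3)$ are the same linear combinations of $(V^1, V^2, V^3)$ as the unit vectors $(\mathbf{e}'_1, \mathbf{e}'_2, \mathbf{e}'_3)$ are of $(\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3)$, even though they follow different rules of transformation. This is because in the first case the matrix R appears before, and in the second case the matrix $R^{-1}$ appears after, the quantities being transformed. Let us highlight this.

$$
\begin{pmatrix} V'^1 \\ V'^2 \\ V'^3 \end{pmatrix}
= \begin{pmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} V^1 \\ V^2 \\ V^3 \end{pmatrix}
= \begin{pmatrix} V^1\cos\theta + V^2\sin\theta \\ -V^1\sin\theta + V^2\cos\theta \\ V^3 \end{pmatrix}, \tag{7.52}
$$

$$
\begin{aligned}
(\mathbf{e}'_1\ \mathbf{e}'_2\ \mathbf{e}'_3) &= (\mathbf{e}_1\ \mathbf{e}_2\ \mathbf{e}_3)
\begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}\\
&= (\cos\theta\, \mathbf{e}_1 + \sin\theta\, \mathbf{e}_2,\ -\sin\theta\, \mathbf{e}_1 + \cos\theta\, \mathbf{e}_2,\ \mathbf{e}_3).
\end{aligned}
\tag{7.53}
$$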


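We shall now obtain the transformed components of a contravariant tensor T under the same rotation of the axes (about the Z-axis) for which the transformation matrix R is given in (7.50). We shall illustrate the procedure for only one component, namely, $T'^{12}$.

$$
\begin{aligned}
T'^{12} &= R^{1}_{.i} R^{2}_{.j} T^{ij}\\
&= R^{1}_{.1} R^{2}_{.1} T^{11} + R^{1}_{.1} R^{2}_{.2} T^{12} + R^{1}_{.1} R^{2}_{.3} T^{13} + R^{1}_{.2} R^{2}_{.1} T^{21}\\
&\quad + R^{1}_{.2} R^{2}_{.2} T^{22} + R^{1}_{.2} R^{2}_{.3} T^{23} + R^{1}_{.3} R^{2}_{.1} T^{31} + R^{1}_{.3} R^{2}_{.2} T^{32}\\
&\quad + R^{1}_{.3} R^{2}_{.3} T^{33}\\
&= (\cos\theta)(-\sin\theta)\, T^{11} + (\cos\theta)(\cos\theta)\, T^{12}\\
&\quad + (\sin\theta)(-\sin\theta)\, T^{21} + (\sin\theta)(\cos\theta)\, T^{22}.
\end{aligned}
\tag{7.54}
$$

The reader should complete the work by working out the transformation of the other eight components. Here are all the transformed components:

$$
\begin{aligned}
T'^{11} &= T^{11}\cos^2\theta + (T^{12} + T^{21})\cos\theta\sin\theta + T^{22}\sin^2\theta,\\
T'^{12} &= T^{12}\cos^2\theta + (T^{22} - T^{11})\cos\theta\sin\theta - T^{21}\sin^2\theta,\\
T'^{13} &= T^{13}\cos\theta + T^{23}\sin\theta,\\
T'^{21} &= T^{21}\cos^2\theta + (T^{22} - T^{11})\cos\theta\sin\theta - T^{12}\sin^2\theta,\\
T'^{22} &= T^{22}\cos^2\theta - (T^{12} + T^{21})\cos\theta\sin\theta + T^{11}\sin^2\theta,\\
T'^{23} &= T^{23}\cos\theta - T^{13}\sin\theta,\\
T'^{31} &= T^{31}\cos\theta + T^{32}\sin\theta,\\
T'^{32} &= T^{32}\cos\theta - T^{31}\sin\theta,\\
T'^{33} &= T^{33}.
\end{aligned}
\tag{7.55}
$$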
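
The double application of R in Eq. (7.49) is mechanical enough to hand over to a short program. The following sketch, with an arbitrary sample tensor and an illustrative angle of 25°, compares the general rule with the closed form of $T'^{12}$; the function name `rotate_tensor` is an assumption of this example.

```python
import math

def rotate_tensor(R, T):
    """T'^{kl} = R^k_i R^l_j T^{ij}, Eq. (7.49)."""
    n = range(3)
    return [[sum(R[k][i] * R[l][j] * T[i][j] for i in n for j in n)
             for l in n] for k in n]

theta = math.radians(25.0)                          # illustrative angle
c, s = math.cos(theta), math.sin(theta)
R = [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]    # Eq. (7.36)
T = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]   # arbitrary components

T_new = rotate_tensor(R, T)
# Closed form for T'^{12}; list indices are shifted by one (T[0][1] is T^{12}).
t12 = T[0][1] * c**2 + (T[1][1] - T[0][0]) * c * s - T[1][0] * s**2
print(T_new[0][1], t12)
trace_old = sum(T[i][i] for i in range(3))
trace_new = sum(T_new[i][i] for i in range(3))
print(trace_old, trace_new)   # the trace is unchanged by the rotation
```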
A contravariant 3-tensor like T can find its companion in the covariant 3-tensor P such that P · T is a 3-scalar. The transformation rules (7.71), (7.75) and (7.49) would suggest that the components of P should transform $\{P_{mn} \to P'_{mn}\}$ as

$$
P'_{mn} = P_{jk} (R^{-1})^{j}_{.m} (R^{-1})^{k}_{.n}. \tag{7.56}
$$


Equations (7.49) and (7.56) constitute the transformation rules for a 
contravariant and a covariant 3-tensor. 

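We shall show that the above transformation rule will ensure invariance of P · T under the transformation R.

Proof.

$$
\begin{aligned}
P'_{mn} T'^{mn} &= \{P_{jk} (R^{-1})^{j}_{.m} (R^{-1})^{k}_{.n}\}\{R^{m}_{.i} R^{n}_{.l} T^{il}\}\\
&= \{(R^{-1})^{j}_{.m} R^{m}_{.i}\}\{(R^{-1})^{k}_{.n} R^{n}_{.l}\} P_{jk} T^{il}\\
&= \delta^{j}_{i}\, \delta^{k}_{l}\, P_{jk} T^{il} = P_{jk} T^{jk}.
\end{aligned}
\tag{7.57}
$$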


There is one tensor which is not related to any physical quantity in physics. It is the metric tensor, which we shall denote as g, and its 3 × 3 components as $\{g_{ij};\ i, j = 1, 2, 3\}$. In Relativity, particularly in the General Theory of Relativity, g plays the most dominant role, being synonymous with the gravitational field itself. The metric tensor defines the geometry of space, or space-time, through an expression of the line element. For a Cartesian coordinate system in E³, the expression of the line element is written in Eq. (7.23). For a spherical coordinate system (r, θ, φ), $\{x^1 = r;\ x^2 = \theta;\ x^3 = \phi\}$, and the same line element is written as

$$
d\ell^2 = dr^2 + r^2 d\theta^2 + r^2 \sin^2\theta\, d\phi^2. \tag{7.58}
$$

The metric tensor’s role in E³ is to write all line elements as

$$
d\ell^2 = g_{ij}\, dx^i dx^j. \tag{7.59}
$$

It is then seen from (7.23) and (7.58) that

$$
\begin{aligned}
g = g^{(\text{cart})} &= \delta_{ij} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}; && \text{Cartesian coordinate system},\\
g = g^{(\text{sph})} &= \begin{pmatrix} 1 & 0 & 0 \\ 0 & (x^1)^2 & 0 \\ 0 & 0 & (x^1 \sin x^2)^2 \end{pmatrix}; && \text{spherical coordinate system}.
\end{aligned}
\tag{7.60}
$$


Why do we call g, whose components are defined by Eq. (7.59) and exemplified by (7.60), a tensor? The reason lies in the Quotient Theorem stated in Sec. 7.9.4. The left-hand side of Eq. (7.59) is a scalar, whereas $dx^i dx^j$ is a contravariant tensor of rank 2 (having two superscripts). Therefore, $g_{ij}$ should stand for a covariant tensor of rank 2 (i.e. with two subscripts).

We shall demonstrate the above tensor character of g by showing that under an orthogonal transformation (e.g. rotation of coordinate axes), the transformation rule (7.56) will convert $g^{(\text{cart})}$ (same as $\delta_{ij}$) to itself.

Proof.

$$
g'_{mn} = \delta_{ij} (R^{-1})^{i}_{.m} (R^{-1})^{j}_{.n} = (R^{-1})^{i}_{.m} (R^{-1})^{i}_{.n} = R^{m}_{.i} (R^{-1})^{i}_{.n} = \delta_{mn}. \tag{7.61}
$$

(QED)

We shall further demonstrate the tensor character of g by showing that when the coordinate system changes from Cartesian to spherical, the same transformation rule will convert $g^{(\text{cart})}$ to $g^{(\text{sph})}$.

7.7. Transformation of the Metric 3-Tensor from Cartesian to 
Spherical 


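As in Eq. (7.61),

$$
g'_{mn} = g^{(\text{sph})}_{mn} = \delta_{ij} (R^{-1})^{i}_{.m} (R^{-1})^{j}_{.n} = (R^{-1})^{i}_{.m} (R^{-1})^{i}_{.n}. \tag{7.62}
$$

The transformation from (x) to (x') is given in Eq. (5.35). $(R^{-1})^{i}_{.m} = \partial x^i / \partial x'^m$. Therefore, $(R^{-1})^{1}_{.1} = \partial x/\partial r = \sin\theta\cos\phi$; $(R^{-1})^{1}_{.2} = \partial x/\partial\theta = r\cos\theta\cos\phi$, etc. We therefore have the following (inverse) transformation matrix:

$$
R^{-1} = \begin{pmatrix}
\sin\theta\cos\phi & r\cos\theta\cos\phi & -r\sin\theta\sin\phi \\
\sin\theta\sin\phi & r\cos\theta\sin\phi & r\sin\theta\cos\phi \\
\cos\theta & -r\sin\theta & 0
\end{pmatrix}. \tag{7.63}
$$

It then follows from (7.62) that

$$
g'_{11} = [(R^{-1})^{1}_{.1}]^2 + [(R^{-1})^{2}_{.1}]^2 + [(R^{-1})^{3}_{.1}]^2 = \sin^2\theta\cos^2\phi + \sin^2\theta\sin^2\phi + \cos^2\theta = 1.
$$

In this way, the other components of $g^{(\text{sph})}$ as given in (7.60) can be obtained.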
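
The remaining components can be confirmed numerically at any sample point. The sketch below evaluates Eq. (7.62) with the matrix (7.63); the function names and the chosen point (r, θ, φ) = (2.0, 0.7, 1.2) are illustrative assumptions.

```python
import math

def inverse_jacobian(r, theta, phi):
    """Matrix of Eq. (7.63): derivatives of (x, y, z) w.r.t. (r, theta, phi)."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    return [[st * cp, r * ct * cp, -r * st * sp],
            [st * sp, r * ct * sp,  r * st * cp],
            [ct,      -r * st,      0.0]]

def spherical_metric(r, theta, phi):
    """g'_{mn} = sum over i of (R^-1)^i_m (R^-1)^i_n, Eq. (7.62)."""
    J = inverse_jacobian(r, theta, phi)
    return [[sum(J[i][m] * J[i][n] for i in range(3)) for n in range(3)]
            for m in range(3)]

r, theta, phi = 2.0, 0.7, 1.2       # an arbitrary sample point
g = spherical_metric(r, theta, phi)
expected = [1.0, r**2, (r * math.sin(theta))**2]
print([g[m][m] for m in range(3)], expected)
```

The off-diagonal entries of `g` come out as zero up to rounding, in agreement with the diagonal form quoted in Eq. (7.60).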

In writing equations of physics that respect relativity, g remains in the background, changing indices (contravariant to covariant and vice versa), and in the process helps us write the laws and principles of physics in the relativistic, covariant language.


7.8. 4-Vectors in Relativity 


The simplest 4-vector, by which we shall mean a vector in M⁴, that comes to one’s mind is a directed straight line segment $\delta\vec{r}$ stretching from some event Θ (which could have occurred yesterday) to some other event Ψ which might have occurred in the past, or may occur tomorrow. Since such a vector spans time as well, we cannot represent it with an earthly rod. 4-vectors cannot be “pictured”. We can only write them on a piece of paper as an ordered quadruple of real numbers, e.g.

$$
\delta\vec{r} = (\delta x^0, \delta x^1, \delta x^2, \delta x^3), \tag{7.64}
$$

or draw a space-time diagram using 4 axes cT, X, Y, Z with one of the axes, e.g. the Z-axis, suppressed, and show it as a geometrical straight line, with an arrowhead showing its direction from Θ to Ψ.

Note that the last three numbers, i.e. the ordered triple $\delta x^1, \delta x^2, \delta x^3$, constitute a 3-vector $\delta\mathbf{r}$, stretching from the spatial location of Θ to the spatial location of Ψ. We shall often find it convenient to write a 4-vector as a 1+3-component object, e.g. $(\delta x^0, \delta\mathbf{r})$.

The primordial 4-vector, or the most elementary 4-vector, the ancestor of all 4-vectors of the contravariant 4-vector family, is the infinitesimal displacement 4-vector $d\vec{r}$ in M⁴, stretching from an event Θ to an infinitesimally close-by event Ψ, the time-space separation between them being $(dx^0, dx^1, dx^2, dx^3)$. Depending on our convenience, we shall write this vector, and its progenies, in four different ways (cf. Sec. 7.1):

$$
d\vec{r} = (dx^\mu) = (dx^0, dx^1, dx^2, dx^3) = (dx^0, d\mathbf{r}). \tag{7.65}
$$

Note that in the last equation we have clubbed the three spatial components of the elementary 4-vector $d\vec{r}$ as $d\mathbf{r} = (dx^1, dx^2, dx^3) = (dx, dy, dz)$. These spatial components, isolated out under the banner $d\mathbf{r}$, constitute a 3-vector, the object of our discussion in Sec. 7.5. We shall give a better definition of 3-vector below Eq. (7.71).

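Under a general coordinate transformation $(x^\mu)$ to $(x'^\mu)$, μ = 0, 1, 2, 3,

$$
x'^\mu = f^\mu(x), \tag{7.66}
$$

the coordinate differentials $(dx^\mu)$ will transform to $(dx'^\mu)$, according to the rule

$$
dx'^\mu = \frac{\partial x'^\mu}{\partial x^\nu}\, dx^\nu = \Omega^{\mu}_{.\nu}\, dx^\nu, \quad \mu = 0, 1, 2, 3, \tag{7.67}
$$

where the coefficients

$$
\Omega^{\mu}_{.\nu} \equiv \frac{\partial x'^\mu}{\partial x^\nu}, \quad \mu, \nu = 0, 1, 2, 3, \tag{7.68}
$$

constitute a 4 × 4 matrix Ω. In the case of Lorentz transformation, the transformation coefficients are constants, as in the case of rotation in E³, cf. Eq. (7.36). It is the same LT matrix written as Eq. (3.14) in Sec. 3.2, and copied into Eq. (7.69) below, in which we have also spelt out the time index 0, and space indices 1, 2, 3 for the four rows (arranged vertically on the left) and the four columns (arranged horizontally at the top):

$$
\Omega = \begin{array}{c|cccc}
 & 0 & 1 & 2 & 3 \\ \hline
0 & \gamma & -\gamma\beta & 0 & 0 \\
1 & -\gamma\beta & \gamma & 0 & 0 \\
2 & 0 & 0 & 1 & 0 \\
3 & 0 & 0 & 0 & 1
\end{array} \tag{7.69}
$$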


By analogy with Eq. (7.39), we now define a contravariant 4-vector to be a vector

$$
\vec{A} = (A^\mu) = (A^0, \mathbf{A}), \tag{7.70}
$$

whose components transform from (x) to (x') according to the rule:

$$
A'^\mu = \Omega^{\mu}_{.\nu} A^\nu, \quad \mu = 0, 1, 2, 3. \tag{7.71}
$$

Note that $A^0$ is the time component, and $\mathbf{A}$ is the space component (or, rather, the three spatial components taken together) of the 4-vector $\vec{A}$. We shall treat $\mathbf{A}$ as a 3-vector, as mentioned below Eq. (7.65). Confining ourselves to Cartesian systems (so that we do not have to invoke the metric tensor), a 3-vector $\mathbf{A}$ possesses the property of invariance under an orthogonal transformation (e.g. rotation of the coordinate axes, but not boost), as per the definition given below Eq. (7.47).


We shall now come to the four-dimensional covariant vector. Imagine the gradient vector ∇ψ upgraded from E³ to M⁴. We get the 4-gradient of a 4-scalar field ψ.

A 4-scalar field ψ(x) is a single-component field, i.e. it is represented by a single function of the four coordinates (x), such that the value of the function does not change at any event point “Θ” under a change of the coordinates from (x) to (x') due to a transformation of the type (7.66). In other words, if ψ(x) → ψ'(x') under the above coordinate transformation, then ψ'(x') = ψ(x).

Now we define the covariant 4-gradient of such a scalar field ψ as the 4-component field:

$$
\begin{aligned}
& \overleftarrow{G} = \overleftarrow{\nabla}\psi,\\
& \text{having components } G_\mu = \nabla_\mu \psi = \frac{\partial \psi}{\partial x^\mu}, \quad \mu = 0, 1, 2, 3,\\
& \text{or } \overleftarrow{G} = (\nabla_0 \psi, \nabla\psi) = \left(\frac{1}{c}\frac{\partial \psi}{\partial t}, \frac{\partial \psi}{\partial x}, \frac{\partial \psi}{\partial y}, \frac{\partial \psi}{\partial z}\right).
\end{aligned}
\tag{7.72}
$$

Note that we have used a leftward arrow to imply a covariant 4-vector, and a subscript to indicate its components, in contrast with the contravariant 4-vector (in Eq. (7.70)), for which we have used a rightward arrow and a superscript to indicate its components.

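Under the coordinate transformation (7.66), the 4-vector G changes to G', and their components change:

$$
G_\mu = \nabla_\mu \psi \;\to\; G'_\mu = \nabla'_\mu \psi' = \frac{\partial \psi'}{\partial x'^\mu} = \frac{\partial \psi}{\partial x^\nu}\frac{\partial x^\nu}{\partial x'^\mu} = G_\nu \frac{\partial x^\nu}{\partial x'^\mu}, \quad \mu = 0, 1, 2, 3. \tag{7.73}
$$

Analogous to Eq. (7.42), we have here

$$
\left(\frac{\partial x^\nu}{\partial x'^\mu}\right)\left(\frac{\partial x'^\mu}{\partial x^\lambda}\right) = \frac{\partial x^\nu}{\partial x^\lambda} = \delta^{\nu}_{\lambda}. \tag{7.74}
$$

Going back to Eq. (7.68) we now identify $(\partial x'^\mu / \partial x^\nu)$ as the $\Omega^{\mu}_{.\nu}$ component of the matrix Ω, and now using (7.74), $(\partial x^\nu / \partial x'^\mu)$ as the $(\Omega^{-1})^{\nu}_{.\mu}$ component of the inverse matrix $\Omega^{-1}$.

We therefore define a covariant 4-vector $\overleftarrow{B} = (B_\mu) = (B_0, B_1, B_2, B_3)$ to be an ordered quadruple of real numbers that transform under the coordinate transformation (7.66) as

$$
B'_\mu = B_\nu (\Omega^{-1})^{\nu}_{.\mu}. \tag{7.75}
$$

The covariant gradient G = ∇ψ obviously satisfies this requirement.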

As a concrete example we shall write down the Lorentz transformation equations for a contravariant 4-vector $\vec{A}$ and a covariant 4-vector G = ∇ψ corresponding to a boost $c\beta\, \mathbf{e}_1$ in the X¹-direction. The relevant matrix has already been given in Eq. (7.69). Its inverse can be determined either rigorously using matrix algebra, or by a short-cut, replacing β with −β:

$$
\Omega^{-1} = \begin{pmatrix}
\gamma & \gamma\beta & 0 & 0 \\
\gamma\beta & \gamma & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}. \tag{7.76}
$$


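Using the matrices (7.69) and (7.76) in the transformation formulas (7.71) and (7.75), respectively, we get the transformed components of these two 4-vectors:

$$
\begin{aligned}
A'^0 &= \gamma(A^0 - \beta A^1),\\
A'^1 &= \gamma(A^1 - \beta A^0),\\
A'^2 &= A^2,\\
A'^3 &= A^3.
\end{aligned}
\tag{7.77a}
$$

$$
\begin{aligned}
B'_0 &= \gamma(B_0 + \beta B_1),\\
B'_1 &= \gamma(B_1 + \beta B_0),\\
B'_2 &= B_2,\\
B'_3 &= B_3.
\end{aligned}
\tag{7.77b}
$$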
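
The opposite signs in Eqs. (7.77a) and (7.77b) are exactly what is needed to keep the contraction $B_\mu A^\mu$ unchanged. The sketch below checks this for arbitrary sample components; the function names and numbers are illustrative assumptions.

```python
import math

def boost_contravariant(A, beta):
    """Eq. (7.77a): A'^mu = Omega^mu_nu A^nu for a boost along X^1."""
    g = 1.0 / math.sqrt(1.0 - beta**2)
    return [g * (A[0] - beta * A[1]), g * (A[1] - beta * A[0]), A[2], A[3]]

def boost_covariant(B, beta):
    """Eq. (7.77b): B'_mu = B_nu (Omega^-1)^nu_mu for the same boost."""
    g = 1.0 / math.sqrt(1.0 - beta**2)
    return [g * (B[0] + beta * B[1]), g * (B[1] + beta * B[0]), B[2], B[3]]

def contract(B, A):
    """The scalar B_mu A^mu."""
    return sum(b * a for b, a in zip(B, A))

beta = 0.6                       # illustrative boost speed
A = [2.0, 1.0, 0.5, -1.0]        # sample contravariant components
B = [1.0, -3.0, 2.0, 4.0]        # sample covariant components
before = contract(B, A)
after = contract(boost_covariant(B, beta), boost_contravariant(A, beta))
print(before, after)             # the same number in both frames
```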


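For future use we shall also write the LT formula for the contravariant vector $\vec{A}$ corresponding to a general boost [S(cβ)S'] along any arbitrary direction n. That is, $\boldsymbol{\beta} = \beta\mathbf{n}$. For this purpose, we adopt the formula (3.18) given in Sec. 3.2, and replace ct with $A^0$ and r with $\mathbf{A}$. The result is [24]

\begin{align}
A'^0 &= \gamma(A^0 - \boldsymbol{\beta}\cdot\mathbf{A}) \tag{7.78a}\\
&= A^0 + [(\gamma - 1)A^0 - \gamma\boldsymbol{\beta}\cdot\mathbf{A}], \tag{7.78b}\\
\mathbf{A}' &= (\gamma\mathbf{A}_\parallel + \mathbf{A}_\perp) - \gamma\boldsymbol{\beta}A^0 \tag{7.78c}\\
&= \mathbf{A} + \left[\frac{\gamma - 1}{\beta^2}(\boldsymbol{\beta}\cdot\mathbf{A})\boldsymbol{\beta} - \gamma\boldsymbol{\beta}A^0\right] \tag{7.78d}\\
&= \mathbf{A} + \mathbf{n}[(\gamma - 1)(\mathbf{n}\cdot\mathbf{A}) - \gamma\beta A^0]. \tag{7.78e}
\end{align}

The corresponding Lorentz transformation matrix is the same as Eq. (3.20). The inverse of the above transformation, corresponding to the boost [S'(−cβ)S], is obtained by replacing β with −β (same as replacing n with −n) and interchanging $(A^0, \mathbf{A})$ with $(A'^0, \mathbf{A}')$:

\begin{align}
A^0 &= \gamma(A'^0 + \boldsymbol{\beta}\cdot\mathbf{A}') \tag{7.79a}\\
&= A'^0 + [(\gamma - 1)A'^0 + \gamma\boldsymbol{\beta}\cdot\mathbf{A}'], \tag{7.79b}\\
\mathbf{A} &= (\gamma\mathbf{A}'_\parallel + \mathbf{A}'_\perp) + \gamma\boldsymbol{\beta}A'^0 \tag{7.79c}\\
&= \mathbf{A}' + \left[\frac{\gamma - 1}{\beta^2}(\boldsymbol{\beta}\cdot\mathbf{A}')\boldsymbol{\beta} + \gamma\boldsymbol{\beta}A'^0\right] \tag{7.79d}\\
&= \mathbf{A}' + \mathbf{n}[(\gamma - 1)(\mathbf{n}\cdot\mathbf{A}') + \gamma\beta A'^0]. \tag{7.79e}
\end{align}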

As for a graphical representation of vectors in M⁴, we shall continue to rely on the straight line picture for the contravariant vectors. The covariant vectors can be represented by a succession of parallel planes. The orientation of the planes (i.e. the normal drawn on them) will indicate the “direction” and the spacing the magnitude. The closer the spacing, the larger the magnitude.


7.9. 4-Tensors in Relativity 


7.9.1. Contravariant, covariant and mixed tensors 


Tensors of relativity will be defined as matrices subject to specific 
transformation rules. A 4-scalar is a tensor of rank 0. A contravariant vector 
and a covariant vector are both tensors of rank 1 and can be represented as a 
4x 1or1 * 4 (i.e. column or row) matrices. A tensor of rank 2 isa 4 x 4 
matrix. 

By an extension of the defining equation (7.49), we shall call the
elements of a 4 × 4 matrix $(T^{\mu\nu})$ as constituting a Contravariant Tensor of
rank 2, if under a coordinate transformation of the form (7.33), they change
into $(T'^{\mu\nu})$ according to the following rule:


$T^{\mu\nu} \to T'^{\mu\nu} = \Omega^{\mu}{}_{\alpha}\, \Omega^{\nu}{}_{\beta}\, T^{\alpha\beta}.$    (7.80)


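Similarly, by an extension of the defining equation (7.56), a 4 × 4
matrix $(K_{\mu\nu})$ which transforms into $(K'_{\mu\nu})$ according to the rule:


$K_{\mu\nu} \to K'_{\mu\nu} = K_{\alpha\beta}\, (\Omega^{-1})^{\alpha}{}_{\mu}\, (\Omega^{-1})^{\beta}{}_{\nu},$    (7.81)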


will constitute a Covariant Tensor of rank 2. 
On the other hand, the matrix $(Q^{\mu}{}_{\nu})$ which transforms into $(Q'^{\mu}{}_{\nu})$ according
to the rule:


$Q^{\mu}{}_{\nu} \to Q'^{\mu}{}_{\nu} = \Omega^{\mu}{}_{\alpha}\, Q^{\alpha}{}_{\beta}\, (\Omega^{-1})^{\beta}{}_{\nu},$    (7.82)


will constitute a Mixed Tensor of rank 2, having contravariant character
with respect to the first index, and covariant character with respect to the
second one. Note that $Q^{\mu}{}_{\nu}$ and $Q_{\nu}{}^{\mu}$ are not the same tensor.

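By obvious generalization, one can extend the definition to any $4^{[1]} \times
4^{[2]} \times \cdots \times 4^{[n]}$ matrix and call it a tensor of rank n by ascribing to it the
necessary transformation properties. For example, the 4 × 4 × 4 matrix $(A^{\mu}{}_{\nu\sigma})$
will represent a tensor of rank 3, in which a contravariant index is followed
by two covariant indices, if


$A^{\mu}{}_{\nu\sigma} \to A'^{\mu}{}_{\nu\sigma} = \Omega^{\mu}{}_{\alpha}\, A^{\alpha}{}_{\beta\gamma}\, (\Omega^{-1})^{\beta}{}_{\nu}\, (\Omega^{-1})^{\gamma}{}_{\sigma}.$    (7.83)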

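As an illustration, we shall obtain the transformation equations for a
contravariant and a covariant tensor of rank 2, corresponding to the boost:
$S(\beta, 0, 0)S'$.


First the contravariant tensor. We shall illustrate the procedure by
obtaining the transformed component $T'^{00}$. The relevant LT matrix $\hat{\Omega}$ is
shown in Eq. (7.69):


$T'^{00} = \Omega^{0}{}_{\alpha}\, \Omega^{0}{}_{\beta}\, T^{\alpha\beta}$
$\quad = \Omega^{0}{}_{0}\Omega^{0}{}_{0} T^{00} + \Omega^{0}{}_{0}\Omega^{0}{}_{1} T^{01} + \Omega^{0}{}_{1}\Omega^{0}{}_{0} T^{10} + \Omega^{0}{}_{1}\Omega^{0}{}_{1} T^{11}$
$\quad = \gamma^2 [T^{00} - \beta (T^{01} + T^{10}) + \beta^2 T^{11}].$    (7.84)


Note that we avoided writing zero terms, like $\Omega^{0}{}_{2}\Omega^{0}{}_{0} T^{20}$, etc.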

Next, the covariant tensor. We shall demonstrate the procedure for the
transformed component $K'_{00}$. The relevant inverse LT matrix is shown in Eq.
(7.76). Hence,


$K'_{00} = K_{\alpha\beta}\, (\Omega^{-1})^{\alpha}{}_{0}\, (\Omega^{-1})^{\beta}{}_{0} = \gamma^2 [K_{00} + \beta (K_{01} + K_{10}) + \beta^2 K_{11}].$    (7.85)


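We shall now write all the transformed components corresponding to
the boost: $S(\beta, 0, 0)S'$.


$T'^{00} = \gamma^2 [T^{00} - \beta (T^{01} + T^{10}) + \beta^2 T^{11}], \qquad T'^{02} = \gamma (T^{02} - \beta T^{12}),$
$T'^{01} = \gamma^2 [T^{01} - \beta (T^{00} + T^{11}) + \beta^2 T^{10}], \qquad T'^{20} = \gamma (T^{20} - \beta T^{21}),$
$T'^{10} = \gamma^2 [T^{10} - \beta (T^{00} + T^{11}) + \beta^2 T^{01}], \qquad T'^{03} = \gamma (T^{03} - \beta T^{13}),$
$T'^{11} = \gamma^2 [T^{11} - \beta (T^{01} + T^{10}) + \beta^2 T^{00}], \qquad T'^{30} = \gamma (T^{30} - \beta T^{31}),$    (7.86)
$T'^{12} = \gamma (T^{12} - \beta T^{02}), \qquad T'^{21} = \gamma (T^{21} - \beta T^{20}),$
$T'^{13} = \gamma (T^{13} - \beta T^{03}), \qquad T'^{31} = \gamma (T^{31} - \beta T^{30}),$
$T'^{22} = T^{22}, \quad T'^{23} = T^{23}, \quad T'^{32} = T^{32}, \quad T'^{33} = T^{33}.$


$K'_{00} = \gamma^2 [K_{00} + \beta (K_{01} + K_{10}) + \beta^2 K_{11}], \qquad K'_{02} = \gamma (K_{02} + \beta K_{12}),$
$K'_{01} = \gamma^2 [K_{01} + \beta (K_{00} + K_{11}) + \beta^2 K_{10}], \qquad K'_{20} = \gamma (K_{20} + \beta K_{21}),$
$K'_{10} = \gamma^2 [K_{10} + \beta (K_{00} + K_{11}) + \beta^2 K_{01}], \qquad K'_{03} = \gamma (K_{03} + \beta K_{13}),$
$K'_{11} = \gamma^2 [K_{11} + \beta (K_{01} + K_{10}) + \beta^2 K_{00}], \qquad K'_{30} = \gamma (K_{30} + \beta K_{31}),$    (7.87)
$K'_{12} = \gamma (K_{12} + \beta K_{02}), \qquad K'_{21} = \gamma (K_{21} + \beta K_{20}),$
$K'_{13} = \gamma (K_{13} + \beta K_{03}), \qquad K'_{31} = \gamma (K_{31} + \beta K_{30}),$
$K'_{22} = K_{22}, \quad K'_{23} = K_{23}, \quad K'_{32} = K_{32}, \quad K'_{33} = K_{33}.$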
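
The component formulas in (7.86) and (7.87) are easy to get wrong by hand, so a brute-force check is useful. The Python sketch below is an illustration only: the helper names and the sample tensor are assumptions, and the boost matrix is the standard one for a boost along X, which is assumed to agree with Eq. (7.69). It performs the double sums of (7.80) and (7.81) and compares a few results with the closed forms.

```python
import math

def boost_x(beta):
    """LT matrix for the boost S(beta, 0, 0)S' (row: new index, column: old index)."""
    g = 1.0 / math.sqrt(1.0 - beta**2)
    return [[g, -g * beta, 0.0, 0.0],
            [-g * beta, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def transform_contravariant(T, L):
    """T'^{mu nu} = L^mu_alpha L^nu_beta T^{alpha beta}, Eq. (7.80)."""
    r = range(4)
    return [[sum(L[m][a] * L[n][b] * T[a][b] for a in r for b in r) for n in r] for m in r]

def transform_covariant(K, Linv):
    """K'_{mu nu} = K_{alpha beta} Linv^alpha_mu Linv^beta_nu, Eq. (7.81)."""
    r = range(4)
    return [[sum(K[a][b] * Linv[a][m] * Linv[b][n] for a in r for b in r) for n in r] for m in r]

beta = 0.6
g = 1.0 / math.sqrt(1.0 - beta**2)
# Arbitrary non-symmetric sample components, reused for both T and K.
T = [[float(4 * i + j + 1) for j in range(4)] for i in range(4)]

Tp = transform_contravariant(T, boost_x(beta))
Kp = transform_covariant(T, boost_x(-beta))   # inverse matrix: beta -> -beta

closed_T00 = g**2 * (T[0][0] - beta * (T[0][1] + T[1][0]) + beta**2 * T[1][1])
closed_K00 = g**2 * (T[0][0] + beta * (T[0][1] + T[1][0]) + beta**2 * T[1][1])
print(math.isclose(Tp[0][0], closed_T00), math.isclose(Kp[0][0], closed_K00))
```

Only the sign of β distinguishes the contravariant from the covariant result, which is the pattern visible on comparing (7.86) with (7.87).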


7.9.2. Equality of two tensors 


If the components of two tensors A and B are equal in a preferred coordinate 
system, say S', then their components are equal in any general coordinate 
system S, and we say that A = B. 

We shall prove this theorem assuming that A, B are contravariant tensors
of rank two. The two tensors being equal in S', the components of the tensor
N = A − B must be zero in S'. That is, $N'^{\mu\nu} = 0$. To get its components $N^{\mu\nu}$ in
S, we apply the inverse of the transformation given in (7.80):


$N^{\mu\nu} = (\Omega^{-1})^{\mu}{}_{\alpha}\, (\Omega^{-1})^{\nu}{}_{\beta}\, N'^{\alpha\beta} = 0.$    (7.88)


(QED) 


7.9.3. The metric tensor of the Minkowski space-time 


We had introduced the concept of metric, and metric tensor, in Sec. 7.4,
through Eqs. (7.23), (7.25), and again in Sec. 7.6, through Eq. (7.59), and
casually mentioned the pivotal role it plays in relativity, especially General
Relativity. The metric tensor $g_{\mu\nu}$ of Minkowski space-time is defined through


$ds^2 = g_{\mu\nu}\, dx^{\mu} dx^{\nu}.$    (7.89)


The Minkowski metric was written in (7.25), which we rewrite in our
new index format as


$ds^2 = (dx^0)^2 - (dx^1)^2 - (dx^2)^2 - (dx^3)^2.$    (7.90)


Comparing the two equations, we get the Minkowski metric as the
following 4 × 4 matrix:


$(g_{\mu\nu}) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}.$    (7.91)


We shall now establish the tensor character of $g_{\mu\nu}$. We cannot follow the
path of (7.61), because in this case the transformation is not an orthogonal
one. We shall demand that under the transformation $(x) \to (x')$ the metric
must remain invariant. That is,


$ds^2 = g_{\mu\nu}\, dx^{\mu} dx^{\nu} = g'_{\alpha\beta}\, dx'^{\alpha} dx'^{\beta}.$    (7.92)


The right-hand side is


$g'_{\alpha\beta}\, (\Omega^{\alpha}{}_{\mu}\, dx^{\mu})(\Omega^{\beta}{}_{\nu}\, dx^{\nu}),$


leading to the transformation rule


$g_{\mu\nu} = g'_{\alpha\beta}\, \Omega^{\alpha}{}_{\mu}\, \Omega^{\beta}{}_{\nu},$    (7.93a)


and its inverse: $g'_{\alpha\beta} = g_{\mu\nu}\, (\Omega^{-1})^{\mu}{}_{\alpha}\, (\Omega^{-1})^{\nu}{}_{\beta}.$    (7.93b)


Equation (7.93b) conforms to the transformation rule for a covariant tensor,
as defined in Eq. (7.81). Hence, $g_{\mu\nu}$ is a covariant tensor of rank 2.


(QED) 


If the transformation $(x) \to (x')$ is an LT, so that the changeover is from
one Cartesian system to another, then as per (7.92)


$ds^2 = (dx'^0)^2 - (dx'^1)^2 - (dx'^2)^2 - (dx'^3)^2,$    (7.94)

so that

$(g'_{\alpha\beta}) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} = (g_{\alpha\beta}).$    (7.95)


The same structure of the metric tensor is preserved under LT. This can also
be seen by working out the components of $g'_{\alpha\beta}$ directly from (7.93b). It now
follows from Eq. (7.93b) that


$g_{\mu\nu} = g_{\alpha\beta}\, \Omega^{\alpha}{}_{\mu}\, \Omega^{\beta}{}_{\nu}.$    (7.96)


We shall now define a Lorentz transformation to be a linear transformation,
from one Cartesian system (x) to another (x'), such that the
transformation matrix $\hat{\Omega}$ satisfies Eq. (7.96). The simplest example of such a
matrix $\hat{\Omega}$ is (7.69), and a more general one is (3.20). Also note that the
above definition is a restatement of the property of the LT given in Eq.
(3.44).
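
The defining property (7.96) can be tested on a candidate matrix. The sketch below is an assumption-laden illustration rather than the book's own procedure: `general_boost` builds the boost matrix implied by (7.78a) and (7.78e), which is presumed to coincide with Eq. (3.20), and `preserves_metric` evaluates the right-hand side of (7.96). A Galilean shear is included as a counter-example.

```python
import math

# Minkowski metric (7.91)
G = [[1.0, 0.0, 0.0, 0.0],
     [0.0, -1.0, 0.0, 0.0],
     [0.0, 0.0, -1.0, 0.0],
     [0.0, 0.0, 0.0, -1.0]]

def general_boost(beta, n):
    """Boost matrix read off from (7.78a), (7.78e) for speed beta along the unit vector n."""
    g = 1.0 / math.sqrt(1.0 - beta**2)
    L = [[0.0] * 4 for _ in range(4)]
    L[0][0] = g
    for i in range(3):
        L[0][i + 1] = L[i + 1][0] = -g * beta * n[i]
        for j in range(3):
            L[i + 1][j + 1] = (1.0 if i == j else 0.0) + (g - 1.0) * n[i] * n[j]
    return L

def preserves_metric(L, tol=1e-12):
    """True if g_{alpha beta} L^alpha_mu L^beta_nu equals g_{mu nu}, Eq. (7.96)."""
    r = range(4)
    M = [[sum(G[a][b] * L[a][m] * L[b][n] for a in r for b in r) for n in r] for m in r]
    return all(abs(M[m][n] - G[m][n]) < tol for m in r for n in r)

lorentz = general_boost(0.8, [0.6, 0.0, 0.8])
# Galilean transformation x' = x - beta*ct, t' = t: linear, but not an LT.
galilean = [[1.0, 0.0, 0.0, 0.0],
            [-0.8, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]
print(preserves_metric(lorentz), preserves_metric(galilean))
```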


7.9.4. New tensors from old ones, index gymnastics 


A special significance of the metric tensor is that it is instrumental in 
generating a class of “converts” through two operations, namely, lowering 
and raising an index. 


Lowering an index 

In Eq. (7.64) we had written the primordial contravariant 4-vector $d\vec{r}$ as $dx^{\mu}
= (dx^0, dx^1, dx^2, dx^3) = (c\,dt, dx, dy, dz)$. One can now define four
infinitesimals $dx_{\mu}$ by the operation:


$dx_{\mu} = g_{\mu\nu}\, dx^{\nu}.$    (7.97)


It follows from the $g_{\mu\nu}$ matrix given in (7.91) that
$dx_{\mu} = (dx^0, -dx^1, -dx^2, -dx^3) = (c\,dt, -dx, -dy, -dz).$    (7.98)
In general, any contravariant vector $V^{\mu}$ can find its covariant
counterpart $V_{\mu}$ through the operation:


$V_{\mu} = g_{\mu\nu}\, V^{\nu}.$    (7.99)


This $V_{\mu}$, as much as $dx_{\mu}$, is a covariant vector, as we shall now prove.


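Proof. Consequent to a Lorentz transformation, $V_{\mu}$ should change to $V'_{\mu}$ as per
the rule:


$V'_{\mu} = g'_{\mu\nu}\, V'^{\nu} = g_{\alpha\beta}\, (\Omega^{-1})^{\alpha}{}_{\mu}\, (\Omega^{-1})^{\beta}{}_{\nu}\, \Omega^{\nu}{}_{\sigma}\, V^{\sigma}$
$\quad = g_{\alpha\beta}\, (\Omega^{-1})^{\alpha}{}_{\mu}\, \delta^{\beta}{}_{\sigma}\, V^{\sigma} = g_{\alpha\beta}\, V^{\beta}\, (\Omega^{-1})^{\alpha}{}_{\mu}$
$\quad = V_{\alpha}\, (\Omega^{-1})^{\alpha}{}_{\mu}.$    (7.100)


This confirms the covariant character of $V_{\mu}$.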


(QED) 


The operation (7.99) is called lowering an index. It consists of
multiplication with $g_{\mu\nu}$, followed by a summation over the contravariant
index that is to be lowered. We can generalize this lowering operation to
cover any tensor that has at least one contravariant index. For example,
consider the tensor $T^{\mu}{}_{\nu}{}^{\sigma}$. We shall show two operations, the first one will
lower the first contravariant index μ, and the second one will lower the
second contravariant index σ.


$T^{\mu}{}_{\nu}{}^{\sigma} \to T_{\rho\nu}{}^{\sigma} = g_{\rho\mu}\, T^{\mu}{}_{\nu}{}^{\sigma},$    (7.101a)


$T^{\mu}{}_{\nu}{}^{\sigma} \to T^{\mu}{}_{\nu\rho} = T^{\mu}{}_{\nu}{}^{\sigma}\, g_{\sigma\rho}.$    (7.101b)


The reader should prove that the new matrices (a) $T_{\rho\nu}{}^{\sigma}$, (b) $T^{\mu}{}_{\nu\rho}$
obtained by the lowering operations are tensors.


Raising an index 


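The converse of the lowering operation is the raising of an index, for which
one employs the contravariant metric tensor $g^{\mu\nu}$, defined by the relation:


$g^{\mu\alpha}\, g_{\alpha\nu} = \delta^{\mu}{}_{\nu}.$    (7.102)
The above equation means that in any coordinate system, $(g^{\mu\nu})$ is the
inverse of $(g_{\mu\nu})$. In particular, if the coordinate system is Cartesian, then
(reader verify)


$(g^{\mu\nu}) = (g_{\mu\nu}) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}.$    (7.103)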


As in the case of the lowering operation, we define the raising
operation as multiplication with $g^{\mu\nu}$, followed by a summation over the
covariant index (the tensor being operated upon must have one or more)
which is being raised. We shall show two examples:


$F_{\mu} \to F^{\mu} = g^{\mu\sigma}\, F_{\sigma},$    (7.104a)


$T^{\alpha}{}_{\beta\gamma} \to T^{\alpha}{}_{\beta}{}^{\sigma} = g^{\sigma\gamma}\, T^{\alpha}{}_{\beta\gamma}.$    (7.104b)


The reader should prove that if the operated matrix (in this case $F_{\mu}$, $T^{\alpha}{}_{\beta\gamma}$)
is a tensor, the end matrix (in this case $F^{\mu}$, $T^{\alpha}{}_{\beta}{}^{\sigma}$) is also a tensor with a
reshuffle of one of its indices.

The reader may have found out by now that in either the lowering or
the raising operation the time component does not change, whereas the
space components only change their signs. For example,


if $A^{\mu} = (A^0, \mathbf{A})$, $K_{\mu} = (K_0, \mathbf{K})$,
then $A_{\mu} = (A^0, -\mathbf{A})$, $K^{\mu} = (K_0, -\mathbf{K})$.    (7.105)


We have provided a few examples of raising and lowering in Appendix
A.2.

There are three other ways of creating new tensors, that do not involve 
the metric tensor directly. We shall briefly explain them. 


Multiplication, division, contraction, scalar product 


The “tensor product” of two tensors A and B of ranks m and n, respectively,
is a tensor C = AB of rank m + n. For example, if $A^{\mu}{}_{\alpha}$ and $B^{\nu\sigma}{}_{\beta}$ are two
tensors, then


$C^{\mu\nu\sigma}{}_{\alpha\beta} = A^{\mu}{}_{\alpha}\, B^{\nu\sigma}{}_{\beta}$    (7.106)


is a tensor of rank 5, having three contravariant and two covariant indices.
This is the Product Theorem.

The inverse of multiplication is quotient. If C = AB and C and B are
tensors of ranks m and n, and m > n, the quotient A = C/B is a tensor of rank
(m − n). For example, if the relationship (7.106) is known to hold, and if
$C^{\mu\nu\sigma}{}_{\alpha\beta}$ and $B^{\nu\sigma}{}_{\beta}$ have been confirmed to possess the desired tensor properties,
then $A^{\mu}{}_{\alpha}$ must be a tensor of rank 2. This is the Quotient Theorem.

If F is a tensor of rank n ≥ 2, and if it has at least one contravariant and
at least one covariant index, then by equating one contravariant index to one
covariant index and summing over it, one gets a contracted tensor of rank n
− 2. For example, if $F^{\mu\nu}{}_{\alpha\beta}$ is a tensor, then


$F^{\mu}{}_{\beta} = F^{\mu\alpha}{}_{\alpha\beta}$    (7.107)


is a tensor of rank 2. This is the Contraction Theorem. The reader must
prove the above three theorems.

Using a combination of lowering (or raising), tensor product and
contraction operations, we shall obtain the scalar product of two
contravariant (or two covariant) vectors. Let $\vec{A} = (A^{\mu})$, $\vec{B} = (B^{\mu})$ be two
contravariant vectors. Using the product operation, the lowering operation and
contraction, in that order, we get


$T^{\mu\nu} = A^{\mu} B^{\nu} \to T^{\mu}{}_{\nu} = A^{\mu} B_{\nu} \to T^{\mu}{}_{\mu} = A^{\mu} B_{\mu}.$    (7.108)


We shall call $A^{\mu} B_{\mu}$ the scalar product of the two contravariant vectors $\vec{A}$ and
$\vec{B}$, and write this as
$\vec{A} \cdot \vec{B} = A^{\mu} B_{\mu} = A^0 B^0 - A^1 B^1 - A^2 B^2 - A^3 B^3 = A^0 B^0 - \mathbf{A} \cdot \mathbf{B}.$    (7.109)


The object A - B is a tensor of rank zero, having been formed by 
contraction of a tensor of rank 2. Therefore, it is a 4-scalar. Its value will be 
the same in all inertial frames. Since two inertial frames are connected by a 
Lorentz transformation, a 4-scalar is also called Lorentz invariant. 

In particular, if we take the scalar product of a 4-vector, contravariant
or covariant, with itself, we get the square of the magnitude of the 4-vector,
often called the norm of the vector. For example, the norm of the
contravariant vector $\vec{A}$ is


$(\vec{A})^2 = A^{\mu} A_{\mu} = (A^0)^2 - (A^1)^2 - (A^2)^2 - (A^3)^2 = (A^0)^2 - \mathbf{A}^2.$    (7.110)


The norm of a 4-vector is Lorentz invariant. 
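
The chain "lower an index, multiply, contract" of (7.108) and (7.109), and the claimed invariance of the result, can be illustrated with a few lines of Python. This is a sketch under stated assumptions: Cartesian coordinates (so the metric is the diagonal matrix (7.91)), a boost along X, and hypothetical sample vectors; the function names are not from the text.

```python
import math

METRIC_DIAGONAL = (1.0, -1.0, -1.0, -1.0)   # diagonal of g_{mu nu}, Eq. (7.91)

def lower(v):
    """V_mu = g_{mu nu} V^nu, Eq. (7.99). Raising uses the same diagonal, Eq. (7.103)."""
    return [g * x for g, x in zip(METRIC_DIAGONAL, v)]

def dot4(a, b):
    """Scalar product A^mu B_mu, Eq. (7.109)."""
    return sum(x * y for x, y in zip(a, lower(b)))

def boost_x(v, beta):
    """Boost S(beta, 0, 0)S' applied to contravariant components."""
    g = 1.0 / math.sqrt(1.0 - beta**2)
    return [g * (v[0] - beta * v[1]), g * (v[1] - beta * v[0]), v[2], v[3]]

# Hypothetical contravariant components.
A = [5.0, 1.0, 2.0, 3.0]
B = [2.0, -1.0, 0.5, 4.0]

product_in_S = dot4(A, B)
product_in_S_prime = dot4(boost_x(A, 0.6), boost_x(B, 0.6))
norm_in_S = dot4(A, A)
norm_in_S_prime = dot4(boost_x(A, 0.6), boost_x(A, 0.6))
print(product_in_S, product_in_S_prime, norm_in_S, norm_in_S_prime)
```

The individual components of A and B change under the boost, while the two contracted quantities agree in S and S' up to rounding.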


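The gradient and the d’Alembertian operator


We had introduced the 4-gradient operator $\nabla_{\mu}$ in Eq. (7.72), as a vehicle
which itself is not a tensor, but changes a 4-scalar field into a covariant
4-vector field. Using “index gymnastics”, we shall obtain the contravariant
form $\nabla^{\mu}$ and the contracted form $\nabla_{\mu}\nabla^{\mu}$ of the same gradient operator, as all
three forms will have important applications in the remaining
chapters of this book.

It follows from (7.72) that


$\nabla_{\mu} = \left( \frac{\partial}{\partial x^0}, \frac{\partial}{\partial x^1}, \frac{\partial}{\partial x^2}, \frac{\partial}{\partial x^3} \right) = \left( \frac{1}{c}\frac{\partial}{\partial t}, \frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z} \right),$    (7.111a)


$\nabla^{\mu} = g^{\mu\nu} \nabla_{\nu} = \left( \frac{\partial}{\partial x^0}, -\frac{\partial}{\partial x^1}, -\frac{\partial}{\partial x^2}, -\frac{\partial}{\partial x^3} \right) = \left( \frac{1}{c}\frac{\partial}{\partial t}, -\frac{\partial}{\partial x}, -\frac{\partial}{\partial y}, -\frac{\partial}{\partial z} \right),$    (7.111b)


$\Box^2 = \nabla_{\mu} \nabla^{\mu} = \left( \frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \frac{\partial^2}{\partial x^2} - \frac{\partial^2}{\partial y^2} - \frac{\partial^2}{\partial z^2} \right) = \left( \frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \nabla^2 \right),$    (7.111c)


where $\nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}$ is the familiar Laplacian operator, and the
operator $\Box^2$ is called the d’Alembertian operator. It is then obvious that if $\Phi(x)$
is a 4-scalar field, then (i) $\nabla_{\mu}\Phi(x)$ is a covariant 4-vector field, (ii) $\nabla^{\mu}\Phi(x)$ is
a contravariant 4-vector field, and (iii) $\Box^2\Phi(x)$ is a 4-scalar field.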

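For future convenience, we shall rewrite Eqs. (7.111a) and (7.111b) as


$(\nabla_{\mu}) = (\nabla_0, \nabla_1, \nabla_2, \nabla_3) = \left( \frac{1}{c}\frac{\partial}{\partial t}, \frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z} \right) = \left( \frac{1}{c}\frac{\partial}{\partial t}, \nabla \right),$

$(\nabla^{\mu}) = (\nabla^0, \nabla^1, \nabla^2, \nabla^3) = \left( \frac{1}{c}\frac{\partial}{\partial t}, -\frac{\partial}{\partial x}, -\frac{\partial}{\partial y}, -\frac{\partial}{\partial z} \right) = \left( \frac{1}{c}\frac{\partial}{\partial t}, -\nabla \right).$    (7.112)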

A detailed exposition of the original paper of Minkowski can be found in [41].


The transformation of coordinates by rotation about the Z-axis [23].


One has to be careful before applying this transformation rule to curvilinear coordinates. A
distinction between the “physical components” and tensor components has to be made, bringing in
the scale factors before making any serious application, say, in mechanics.


Goldstein, op. cit., p. 147ff.
“Metric as the foundation of all”, p. 304 of Ref. [5].


Chapter 8 


Four Vectors of Relativistic 
Mechanics 


Chapter 4 outlined the salient concepts and formulas of Relativistic
Mechanics. However, the style and format were non-relativistic, in the
sense that the kinematic and dynamical quantities (like velocity,
momentum and force) were written as 3-vectors, whereas the true spirit of
relativity would require their expression as 4-vectors. This requirement
will be justified in Chapter 11 in the context of the Principle of Covariance.
Our immediate concern now is to find 4-vectors which will satisfactorily
represent the above quantities. We shall sometimes (but not always) follow
the convention set at the beginning of Sec. 7.8, in particular in Eq. (7.65),
i.e. first write each quantity as a geometrical object with no reference to its
components, followed by a specific reference to it as a 4-component object.

Let us be specific that the 4-vectors of relativistic mechanics that we are
going to construct and deploy for our purpose are contravariant vectors, e.g.
4-displacement $d\vec{r}$, 4-velocity $\vec{V}$, 4-acceleration $\vec{A}$, 4-momentum $\vec{P}$, 4-force
$\vec{\mathcal{F}}$. Our immediate purpose is to build up the equation of motion in the
Minkowski space-time $M^4$, and then use it in the larger objective of
constructing the energy-momentum stress tensor in Chapter 12. The above
4-vectors will play a key role in achieving this primary objective.


8.1. 4-Displacement 


Consider once again the motion of a point particle in Fig. 8.1. In Fig. 8.1(a),
we have depicted its physical trajectory C in $E^3$, the Euclidean 3-space.

Fig. 8.1. (a) Trajectory of a particle in $E^3$. (b) World line in $M^4$.


In Fig. 8.1(b), we have shown its world line T in the four-dimensional 
Minkowski space M4, suppressing the Z-axis. 

P(x, y, z) and Q(x + dx, y + dy, z + dz) are two infinitesimally close
points on the trajectory C, reached by the particle at times t and t + dt,
respectively. The coordinates of these events are


$\Theta_P = (x^{\mu}) = (ct, \mathbf{r}) = (ct, x, y, z),$


$\Theta_Q = (x^{\mu} + dx^{\mu}) = (ct + c\,dt, \mathbf{r} + d\mathbf{r}) = (ct + c\,dt, x + dx, y + dy, z + dz).$


The infinitesimal 4-displacement from $\Theta_P$ to $\Theta_Q$,


$d\vec{r} = (dx^{\mu}) = (dx^0, dx^1, dx^2, dx^3) = (c\,dt, dx, dy, dz),$    (8.1)


is the “primordial” contravariant 4-vector from which all other “truly”
contravariant 4-vectors (i.e. those which have not been converted by the
“raising” operation) are obtained by multiplication with scalars and
differentiation (cf. Sec. 7.5).

We shall define the basis vectors along the {cT, X, Y, Z} axes as $\vec{e}_{\mu} =
(\vec{e}_0, \vec{e}_1, \vec{e}_2, \vec{e}_3)$, in terms of which we shall write the above displacement
4-vector as


$d\vec{r} = \vec{e}_{\mu}\, dx^{\mu} = (c\,dt, d\mathbf{r}),$    (8.2)
where $d\mathbf{r}$ = displacement 3-vector from P to Q.


The time interval dt, which was treated as a scalar while obtaining the $\mathbf{v}$,
$\mathbf{a}$, $\mathbf{p}$ and $\mathbf{F}$ vectors in Eq. (7.31), is no longer a scalar in the $M^4$ context,
because the “time interval” transforms under an LT. The quantity which
should now replace dt is dτ, the “proper time” interval between the events
$\Theta_P$ and $\Theta_Q$. It is then imperative that we should establish the 4-scalar nature
of dτ.


The norm of the 4-displacement $dx^{\mu}$ is, according to Eqs. (7.110) and
(7.90):


$ds^2 = c^2 dt^2 - d\mathbf{r}^2,$    (8.3)


and is therefore a 4-scalar. In the instantaneous rest frame of the particle, dt
= dτ, and $d\mathbf{r} = 0$. Therefore,


$ds^2 = c^2 d\tau^2$, so that $d\tau = ds/c.$    (8.4)
Since ds is a 4-scalar, and c is a universal constant, dτ is a 4-scalar. (QED)


8.2. 4-Velocity 


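We shall now define the 4-velocity $\vec{V} = (V^{\mu})$ of the particle at the event $(x^{\mu})$
as the following contravariant vector:


$\vec{V} = \frac{d\vec{r}}{d\tau} = \left( \frac{c\,dt}{d\tau}, \frac{d\mathbf{r}}{d\tau} \right) = \left( \frac{c\,dt}{d\tau}, \frac{dx}{d\tau}, \frac{dy}{d\tau}, \frac{dz}{d\tau} \right).$    (8.5)


Also, as we explained in Eq. (4.37),


$\frac{dt}{d\tau} = \Gamma = \frac{1}{\sqrt{1 - v^2/c^2}},$ so that $\frac{d}{d\tau} = \Gamma \frac{d}{dt}.$    (8.6)


Therefore,


$\vec{V} = \vec{e}_{\mu} V^{\mu} = \Gamma \left( c\,\frac{dt}{dt}, \frac{d\mathbf{r}}{dt} \right)$    (8.7a)

$\quad = (\Gamma c, \Gamma \mathbf{v})$    (8.7b)

$\quad = \Gamma (c, v_x, v_y, v_z).$    (8.7c)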


To avoid confusion in future, we shall denote the Cartesian components
of the velocity 3-vectors in lower case, and with subscripts 1, 2, 3 to mean x,
y, z. We shall rewrite (8.7) as


$\vec{V} = \vec{e}_{\mu} V^{\mu} = (\Gamma c, \Gamma \mathbf{v}) = \Gamma (c, v_1, v_2, v_3).$    (8.8)


What is the magnitude of the 4-velocity $\vec{V}$? As suggested through Eq.
(7.110), the quantity that comes closest to “magnitude” is “norm”, which in
this case is


$\vec{V} \cdot \vec{V} = (\vec{V})^2 = V^{\mu} V_{\mu} = \Gamma^2 (c^2 - v^2) = \Gamma^2 c^2 (1 - v^2/c^2) = c^2.$    (8.9)


Therefore, the magnitude of any 4-velocity is


$\sqrt{(\vec{V})^2} = c.$    (8.10)


All 4-velocities have the same “magnitude”, equal to the speed of light c. 
Mount Everest and a 1 GeV proton move with the same absolute magnitude 
of speed. The reason is not difficult to find. You may be the fastest globe 
trotter. Yet when you view the world from your jet plane, you find yourself 
at rest, and the rest of the world, including Mount Everest, moving at jet
speed. Since motion is relative, it is not possible to make an absolute 
judgement of which is slow and which is fast. Therefore, all velocities, 
“big”, “small” even “zero,” have the same absolute magnitude. Relativity is 
a great equalizer. 
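
A quick numerical illustration of (8.9) and (8.10) follows. It is a sketch only: the three sample velocities are assumed values chosen to span "zero", "small" and "big" speeds, and the helper names are not from the text.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def four_velocity(v):
    """Components Gamma*(c, v) of Eq. (8.7b) for a 3-velocity v given in m/s."""
    gamma = 1.0 / math.sqrt(1.0 - sum(vi * vi for vi in v) / C**2)
    return [gamma * C] + [gamma * vi for vi in v]

def magnitude(u):
    """Square root of the norm (8.9): sqrt((V^0)^2 - V.V)."""
    return math.sqrt(u[0] ** 2 - sum(x * x for x in u[1:]))

# Illustrative 3-velocities (assumptions): a mountain at rest,
# a jet plane at 250 m/s, and a fast particle at 0.875 c.
samples = {
    "mountain": (0.0, 0.0, 0.0),
    "jet": (250.0, 0.0, 0.0),
    "fast particle": (0.0, 0.875 * C, 0.0),
}
ratios = {name: magnitude(four_velocity(v)) / C for name, v in samples.items()}
print(ratios)
```

Each ratio equals 1 up to rounding, whatever the 3-velocity.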


8.3. 4-Acceleration 


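The 4-acceleration $\vec{A}$ is a contravariant vector defined as $d\vec{V}/d\tau$. We shall write
it more explicitly as follows:


$\vec{A} = \frac{d\vec{V}}{d\tau} = \frac{d^2\vec{r}}{d\tau^2},$

or $\vec{A} = \vec{e}_{\mu} A^{\mu} = \vec{e}_{\mu} \frac{d^2 x^{\mu}}{d\tau^2} = \left( c\,\frac{d^2 t}{d\tau^2}, \frac{d^2 x}{d\tau^2}, \frac{d^2 y}{d\tau^2}, \frac{d^2 z}{d\tau^2} \right).$    (8.11)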

It is easy to see that the 4-vector $\vec{A}$ is “orthogonal” to the 4-vector $\vec{V}$:
$\vec{A} \cdot \vec{V} = 0.$    (8.12)


To see this, take the proper-time derivative of both sides of (8.9). The right
side reduces to 0. The left-hand side becomes $\frac{d}{d\tau}(\vec{V} \cdot \vec{V}) = 2\vec{V} \cdot \vec{A}$.


(QED) 


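We shall go a little bit deeper and examine in what way the
4-acceleration $\vec{A}$ is connected with the 3-acceleration


$\mathbf{a} \overset{\text{def}}{=} \frac{d\mathbf{v}}{dt} = \dot{\mathbf{v}},$    (8.13)


as in N.R. mechanics. In the equations below, as above, “dot” would
mean $\frac{d}{dt}$. Referring to Eq. (8.7b) for differentiation,


$\vec{A} = \frac{d\vec{V}}{d\tau} = \Gamma \frac{d}{dt} [\Gamma (c, \mathbf{v})] = \Gamma \dot{\Gamma} (c, \mathbf{v}) + \Gamma^2 (0, \mathbf{a}),$

or $\vec{A} = \Gamma (\dot{\Gamma} c, \dot{\Gamma} \mathbf{v} + \Gamma \mathbf{a}).$    (8.14)
We shall now use (8.12) to eliminate $\dot{\Gamma}$.
$(\dot{\Gamma} c, \dot{\Gamma} \mathbf{v} + \Gamma \mathbf{a}) \cdot \Gamma (c, \mathbf{v}) = 0,$
or $\Gamma [\dot{\Gamma} c^2 - (\dot{\Gamma} \mathbf{v} \cdot \mathbf{v} + \Gamma \mathbf{a} \cdot \mathbf{v})] = 0,$
or $\dot{\Gamma} (c^2 - v^2) - \Gamma \mathbf{a} \cdot \mathbf{v} = 0.$    (8.15)

Hence, using (4.15), $\dot{\Gamma} = \frac{\Gamma^3}{c^2} (\mathbf{a} \cdot \mathbf{v}).$


Using the above result, we can now rewrite (8.14) in the more useful form:


$\vec{A} = \left( \frac{\Gamma^4}{c} (\mathbf{a} \cdot \mathbf{v}),\ \frac{\Gamma^4}{c^2} (\mathbf{a} \cdot \mathbf{v}) \mathbf{v} + \Gamma^2 \mathbf{a} \right).$    (8.16)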
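
As a consistency check on (8.16), one may evaluate the scalar product of the 4-acceleration with the 4-velocity for arbitrary v and a and see whether it vanishes as (8.12) demands. The sketch below assumes units with c = 1 and hypothetical sample values; the function names are illustrative.

```python
import math

def dot3(a, b):
    return sum(x * y for x, y in zip(a, b))

def lorentz_factor(v, c=1.0):
    return 1.0 / math.sqrt(1.0 - dot3(v, v) / c**2)

def four_velocity(v, c=1.0):
    """Eq. (8.7b): (Gamma c, Gamma v)."""
    G = lorentz_factor(v, c)
    return [G * c] + [G * vi for vi in v]

def four_acceleration(v, a, c=1.0):
    """Eq. (8.16): (Gamma^4 (a.v)/c, Gamma^4 (a.v) v/c^2 + Gamma^2 a)."""
    G = lorentz_factor(v, c)
    av = dot3(a, v)
    return [G**4 * av / c] + [G**4 * av * vi / c**2 + G**2 * ai for vi, ai in zip(v, a)]

def minkowski_dot(x, y):
    """Scalar product (7.109): x^0 y^0 - x.y."""
    return x[0] * y[0] - dot3(x[1:], y[1:])

# Hypothetical 3-velocity (|v| < c) and 3-acceleration, in units with c = 1.
v = (0.3, 0.4, 0.5)
a = (1.0, -2.0, 0.5)
orthogonality = minkowski_dot(four_acceleration(v, a), four_velocity(v))
print(abs(orthogonality) < 1e-9)
```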


8.4. 4-Momentum, or En-Mentum 


Multiplying the 4-velocity $\vec{V}$ with the rest mass $m_o$ of the particle we get the
4-momentum $\vec{P}$ of the particle, which is an important member of the
contravariant family.


$\vec{P} = m_o \vec{V} = m_o \Gamma (c, \mathbf{v}) = m_o \Gamma (c, v_x, v_y, v_z).$    (8.17)


According to Eqs. (4.45), (4.46), (4.79),
$m$ = relativistic mass = $\Gamma m_o$,    (8.18a)
$\mathbf{p}$ = relativistic momentum = $\Gamma m_o \mathbf{v} = m\mathbf{v}$,    (8.18b)
$\mathcal{E}$ = total energy = $\Gamma m_o c^2 = mc^2$.    (8.18c)


Note that we have used two different symbols for energy, namely E in
Chapter 4, and now $\mathcal{E}$ in the present chapter. The reason is that we shall
prefer to reserve “E” for the magnitude of the electric field, as we are going
to take up the covariant equations of electrodynamics in Chapter 11.

It follows from (8.17) and (8.18) that the time component of the
4-momentum is energy (divided by c). Therefore,


$\vec{P} = \vec{e}_{\mu}\, p^{\mu} = \left( \frac{\mathcal{E}}{c}, \mathbf{p} \right)$ = En-Mentum.    (8.19)


Energy is thus integrated with momentum into a single 4-vector, the
4-momentum, in which energy is the zeroth component, i.e. the time
component, followed by three space components. It will be easier to
remember the constituents of this 4-vector, and their ordering, if we give the
4-momentum an alternative and generic name En-Mentum.

There is a more important justification. The name 4-vector, or 4- 
momentum, makes us expect four components in the 4-vector involved. 
Most often this is not true. There can be 2-components, or 3-components 
when the motion is one-dimensional, or two-dimensional. The name En- 
Mentum will steer clear of this confusion. 


This fusion of two distinct quantities, energy and momentum, into a
single entity En-Mentum was expected, if not overdue, following our
awakening to the fact that the energy and momentum conservation laws
could not be separated into a conservation of one of them without the other,
as we saw in Sec. 4.6. Either we get the conservation of the united
En-Mentum or no conservation at all. Read our observation following Eq.
(4.92). We can therefore interpret the energy-momentum transformation
equations (4.89) and (4.90) as the transformation of En-Mentum, which we
rewrite here as the transformation of its time-space components:


$p'^0 = \gamma (p^0 - \beta p^1) \ \Rightarrow\ \frac{\mathcal{E}'}{c} = \gamma \left( \frac{\mathcal{E}}{c} - \beta p_x \right),$    (8.20a)
$p'^1 = \gamma (p^1 - \beta p^0) \ \Rightarrow\ p'_x = \gamma \left( p_x - \beta \frac{\mathcal{E}}{c} \right),$    (8.20b)
$p'^2 = p^2;\ p'^3 = p^3 \ \Rightarrow\ p'_y = p_y;\ p'_z = p_z.$    (8.20c)


Suppose in the frame S we have pure energy $\mathcal{E}$, but no momentum (we
can think of a ball of radiation, propagating isotropically in all directions).
Thanks to $\mathcal{E} = mc^2$, this radiation will have a rest mass $m_o = \mathcal{E}/c^2$. According
to (8.20b), this “ball” will have a momentum $p'_x = -\gamma \beta c\, m_o$ in S', same as that of a
particle of rest mass $m_o$, moving with the velocity $-\beta c$, in agreement with
(8.18b). Pure energy in one frame transforms into a combination of energy
and momentum in another.

It is seen from (8.18c) that the energy $\mathcal{E}$ of a particle is proportional to
the Lorentz factor Γ, which can be used as a representative of the energy.
From (8.20a), we find its transformation from S to S' under the boost
$S(\beta, 0, 0)S'$:


$\Gamma' = \gamma \Gamma (1 - \beta v_x / c).$    (8.21)


The magnitude of 4-momentum (its “norm”) has a special significance.
By Eqs. (8.9) and (8.17),


$(\vec{P})^2 = m_o^2 (\vec{V})^2 = m_o^2 c^2,$    (8.22)


so that the magnitude of 4-momentum is $m_o c$. However, from Eq. (8.19),


$(\vec{P})^2 = (p^0)^2 - \mathbf{p}^2 = \frac{\mathcal{E}^2}{c^2} - p^2.$    (8.23)

Equating the right sides of the above two equations, we rediscover the
energy-momentum relation given earlier in Eq. (4.83):
$\mathcal{E}^2 = (m_o c^2)^2 + p^2 c^2.$    (8.24)
For a massless particle, e.g. photon, the above equation yields


$\mathcal{E} = |\mathbf{p}| c,$    (8.25)


as in (4.84). We shall therefore specialize the definition of 4-momentum
given in (8.19) for a photon:


$\vec{P} = \vec{e}_{\mu}\, p^{\mu} = \left( \frac{\mathcal{E}}{c}, \frac{\mathcal{E}}{c} \mathbf{n} \right)$ for a photon,    (8.26)


where $\mathbf{n}$ is the direction of propagation of the photon. For a photon,
$(\vec{P})^2 = 0,$    (8.27)


so that the En-Mentum of a photon is a null vector.
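
The statements about the norm of the En-Mentum, the null character of the photon's En-Mentum, and the "ball of radiation" example can be illustrated numerically. The sketch below works in units with c = 1 and uses hypothetical sample values; the function names are assumptions made for the illustration.

```python
import math

def en_mentum(m_o, v, c=1.0):
    """(E/c, p) = Gamma * m_o * (c, v), Eqs. (8.17) and (8.19)."""
    gamma = 1.0 / math.sqrt(1.0 - sum(x * x for x in v) / c**2)
    return [gamma * m_o * c] + [gamma * m_o * x for x in v]

def norm_squared(p):
    """(p^0)^2 - p.p, Eq. (8.23)."""
    return p[0] ** 2 - sum(x * x for x in p[1:])

def boost_x(p, beta):
    """Eq. (8.20) for the boost S(beta, 0, 0)S'."""
    g = 1.0 / math.sqrt(1.0 - beta**2)
    return [g * (p[0] - beta * p[1]), g * (p[1] - beta * p[0]), p[2], p[3]]

particle = en_mentum(2.0, (0.3, 0.0, 0.4))   # rest mass 2, so the norm should be 4
photon = [5.0, 3.0, 0.0, 4.0]                # (E/c, (E/c) n) with n = (0.6, 0, 0.8)
ball = [2.0, 0.0, 0.0, 0.0]                  # pure energy E = 2, no momentum in S
ball_in_s_prime = boost_x(ball, 0.6)         # acquires momentum -gamma*beta*E/c

print(norm_squared(particle), norm_squared(photon), ball_in_s_prime)
```

The boosted "ball" has a nonzero x-momentum while its norm is unchanged, which is the content of the remark that pure energy in one frame becomes a mixture of energy and momentum in another.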


8.5. 4-Force, or Pow-Force and Minkowski’s Equation of 
Motion of a Point Particle 


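The structure of the 4-force $\vec{\mathcal{F}}$ should be such that it will recast Newton’s
equation of motion $\mathbf{F} = \frac{d\mathbf{p}}{dt}$ in $E^3$ into Minkowski’s equation of motion (to be
referred to as EoM frequently) in $M^4$ in the form


$\frac{d\vec{P}}{d\tau} = \vec{\mathcal{F}}.$    (8.28)


The 4-force $\vec{\mathcal{F}}$ that appears on the right is known as Minkowski force.
However, we may like to call it Pow-Force, for reasons that will follow. Let
us write


$\vec{\mathcal{F}} = \vec{e}_{\mu}\, \mathcal{F}^{\mu} = (\mathcal{F}^0, \boldsymbol{\mathcal{F}}).$    (8.29)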


The EoM (8.28) combined with the definition of $\vec{P}$ given in (8.18), (8.19)
means that


$\frac{1}{c} \frac{d\mathcal{E}}{d\tau} = \frac{\Gamma}{c} \frac{d\mathcal{E}}{dt} = \frac{\Gamma}{c} \Pi = \mathcal{F}^0,$    (8.30a)


$\frac{d\mathbf{p}}{d\tau} = \Gamma \frac{d\mathbf{p}}{dt} = \boldsymbol{\mathcal{F}},$    (8.30b)


where Π (capital pi) = $\frac{d\mathcal{E}}{dt}$. It stands for the power received by the particle
(same as energy received by the particle per unit time), due to (i) work done
on it by external forces, and/or (ii) absorption of radiation or heat
(thereby changing its rest mass). In the case of a particle whose rest mass
does not change, Π is the same as the power delivered by the force $\mathbf{F}$; cf.
Eq. (4.81).

According to Newton’s second law of motion, Eq. (4.47), $\frac{d\mathbf{p}}{dt} = \mathbf{F}$. Hence
the space component of $\vec{\mathcal{F}}$ is identified as $\Gamma \mathbf{F}$, and the time component as
$\mathcal{F}^0 = \Gamma \Pi / c$. Then


$\vec{\mathcal{F}} = \Gamma \left( \frac{\Pi}{c}, \mathbf{F} \right)$ = Pow-Force.    (8.31)


Since the time component of $\vec{\mathcal{F}}$ is Power (multiplied by Γ/c), it may be
easier to remember the constituents of this 4-vector, and their ordering, by
the alternative and generic name Pow-Force.

Since $\frac{d}{d\tau} = \Gamma \frac{d}{dt}$, Minkowski’s EoM (8.28) gets resolved into the following
time and space components:


Time component: $\frac{d\mathcal{E}}{dt} = \Pi.$    (8.32a)


Space component: $\frac{d\mathbf{p}}{dt} = \mathbf{F}.$    (8.32b)


The space component of Minkowski’s equation of motion is a 
restatement of Newton’s second law of motion. The corresponding time 
component expresses conservation of energy. It simply states that the rate of 


change of the kinetic energy of a particle equals the power delivered by the 
external agents (e.g. forces) acting on it. 

Let us consider an important application of the EoM (8.28) to a point
particle with a constant rest mass $m_o$. Due to (8.17), the above EoM then
takes the alternative form
$m_o \vec{A} = \vec{\mathcal{F}}.$    (8.33)


The above equation shows that this 4-force is orthogonal to the 4-velocity, a
fact that follows from (8.12):


$\vec{\mathcal{F}} \cdot \vec{V} = 0.$    (8.34)


Therefore, the four components of $\vec{\mathcal{F}}$ are not independent. We shall fix the
time component using (8.7), (8.31) and (8.34).


$\left( \Gamma \frac{\Pi}{c} \right) (\Gamma c) - (\Gamma \mathbf{F}) \cdot (\Gamma \mathbf{v}) = 0 \ \Rightarrow\ \Pi = \mathbf{F} \cdot \mathbf{v}.$    (8.35)


Hence, for a particle with constant rest mass the 4-force must have the
form:


$\vec{\mathcal{F}} = \Gamma \left( \frac{\mathbf{F} \cdot \mathbf{v}}{c}, \mathbf{F} \right).$    (8.36)


Now the equation of motion (8.28) in $M^4$ will have the time and space
components:


Time component: $\frac{d\mathcal{E}}{dt} = \mathbf{F} \cdot \mathbf{v}.$    (8.37a)


Space component: $\frac{d\mathbf{p}}{dt} = \mathbf{F}.$    (8.37b)


8.6. Force on a Particle with a Variable Rest Mass 


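Let us now be more general and consider a 4-force that can change the rest
mass of the particle. We shall then write


$\vec{\mathcal{F}}_{tot} = \vec{\mathcal{F}} + \vec{\mathcal{K}},$    (8.38)


where $\vec{\mathcal{F}}$ is the force that does not alter the mass of the particle, and is
orthogonal to the 4-velocity. It has the same properties as in (8.33) to (8.37).
$\vec{\mathcal{K}}$ can be called a convective force, as it can alter the mass of the particle.
The time-space components of $\vec{\mathcal{K}}$ can be written as


$\vec{\mathcal{K}} = \Gamma \left( \frac{\Pi_c}{c}, \mathbf{K} \right)$ in S; $\quad \vec{\mathcal{K}} = \left( \frac{\Pi_{co}}{c}, \mathbf{K}_o \right)$ in $S_o$,    (8.39a)


$\vec{\mathcal{F}}_{tot} \cdot \vec{V} = \vec{\mathcal{F}} \cdot \vec{V} + \vec{\mathcal{K}} \cdot \vec{V} = \vec{\mathcal{K}} \cdot \vec{V} = \Pi_{co},$    (8.39b)


where $\Pi_{co} = \frac{dm_o}{d\tau} c^2$    (8.39c)

is the convective power absorbed in the rest frame $S_o$, and $m_o$ is the proper mass of the
particle. We shall illustrate this case with the example of a Relativistic
Rocket in Sec. 9.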


8.7. Lorentz Transformation of the 4-Vectors of Relativistic 
Mechanics 

We shall apply Lorentz transformation to the 4-vectors of Relativistic 

Mechanics and from the resulting relations extract the transformation 


formulas of the corresponding 3-vectors, and make sure that they agree with 
the transformations obtained in Sec. 4.1 and 4.2. 


8.7.1. LT of 4-velocity 


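We shall apply the general transformation given in Eq. (7.78) to the
velocity 4-vector $\vec{V}$ shown in (8.7b). We shall rewrite the same as


$\vec{V} = \Gamma c (1, \boldsymbol{\nu}),$    (8.40)


where $\boldsymbol{\nu} = \mathbf{v}/c$ as in (4.22), so that the quantities inside the brackets are
dimensionless:


(7.78a) ⇒ $\Gamma' = \gamma \Gamma (1 - \boldsymbol{\beta} \cdot \boldsymbol{\nu}),$    (8.41a)
(7.78c) ⇒ $\Gamma' \boldsymbol{\nu}' = \Gamma [(\gamma \boldsymbol{\nu}_\parallel + \boldsymbol{\nu}_\perp) - \gamma \boldsymbol{\beta}],$    (8.41b)
or $\boldsymbol{\nu}' = \frac{\Gamma [(\gamma \boldsymbol{\nu}_\parallel + \boldsymbol{\nu}_\perp) - \gamma \boldsymbol{\beta}]}{\Gamma'},$    (8.41c)
using (8.41a), $\boldsymbol{\nu}'_\parallel + \boldsymbol{\nu}'_\perp = \boldsymbol{\nu}' = \frac{(\gamma \boldsymbol{\nu}_\parallel + \boldsymbol{\nu}_\perp) - \gamma \boldsymbol{\beta}}{\gamma (1 - \boldsymbol{\beta} \cdot \boldsymbol{\nu})},$    (8.41d)
from which $\boldsymbol{\nu}'_\parallel = \frac{\boldsymbol{\nu}_\parallel - \boldsymbol{\beta}}{1 - \boldsymbol{\beta} \cdot \boldsymbol{\nu}},$    (8.41e)
and $\boldsymbol{\nu}'_\perp = \frac{\boldsymbol{\nu}_\perp}{\gamma (1 - \boldsymbol{\beta} \cdot \boldsymbol{\nu})}.$    (8.41f)


We get back (4.23a) for the special case of x-orientation.
Equation (8.41a) will be found useful in many applications. Hence, we
shall write it, and its inverse, as two separate equations:


$\Gamma' = \gamma \Gamma (1 - \boldsymbol{\beta} \cdot \boldsymbol{\nu}),$    (8.42a)
$\Gamma = \gamma \Gamma' (1 + \boldsymbol{\beta} \cdot \boldsymbol{\nu}').$    (8.42b)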
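
The velocity transformation (8.41e), (8.41f) and the Lorentz-factor relation (8.42a) can be cross-checked against each other. The sketch below is an illustration with assumed sample values: velocities are expressed as v/c, `transform_velocity` splits the velocity into parts parallel and perpendicular to n, and the Lorentz factor computed from the transformed velocity is compared with the right-hand side of (8.42a).

```python
import math

def dot3(a, b):
    return sum(x * y for x, y in zip(a, b))

def lorentz_factor(nu):
    """Gamma for a dimensionless velocity nu = v/c."""
    return 1.0 / math.sqrt(1.0 - dot3(nu, nu))

def transform_velocity(nu, beta, n):
    """Eqs. (8.41e) and (8.41f): v/c in S mapped to v'/c in S' for a boost beta*n."""
    g = 1.0 / math.sqrt(1.0 - beta**2)
    n_nu = dot3(n, nu)
    parallel = [n_nu * ni for ni in n]
    perpendicular = [x - p for x, p in zip(nu, parallel)]
    denom = 1.0 - beta * n_nu
    new_parallel = [(p - beta * ni) / denom for p, ni in zip(parallel, n)]   # (8.41e)
    new_perpendicular = [q / (g * denom) for q in perpendicular]             # (8.41f)
    return [a + b for a, b in zip(new_parallel, new_perpendicular)]

# Hypothetical sample: boost along X with beta = 0.6.
beta, n = 0.6, (1.0, 0.0, 0.0)
nu = [0.5, 0.2, 0.0]
nu_prime = transform_velocity(nu, beta, n)

gamma_boost = 1.0 / math.sqrt(1.0 - beta**2)
lhs = lorentz_factor(nu_prime)                                        # Gamma'
rhs = gamma_boost * lorentz_factor(nu) * (1.0 - beta * dot3(n, nu))   # (8.42a)
print(nu_prime, math.isclose(lhs, rhs))
```

A light signal along n, i.e. ν = (1, 0, 0), is mapped to ν' = (1, 0, 0) by the same function, as expected.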


8.7.2. LT of 4-acceleration 


The components of $\vec{A}$ are given in (8.16). To make the formulas less
complicated we shall specialize the expression for $\vec{A}$ to rectilinear motion
along the X-axis, and get the following expression for $\vec{A}$, having only t and
x components:


$A^0 = \frac{\Gamma^4}{c}\, a v, \qquad A^1 = \frac{\Gamma^4}{c^2}\, a v^2 + \Gamma^2 a = \Gamma^4 a \left\{ \frac{v^2}{c^2} + \frac{1}{\Gamma^2} \right\} = \Gamma^4 a.$    (8.43)


In short,


$\vec{A} = \Gamma^4 a \left( \frac{v}{c}, 1 \right),$    (8.44)


where a is the acceleration along the X-axis. We shall now let $\vec{A} \to \vec{A}'$, and
apply the transformation (7.77a).


$A'^0 = \gamma (A^0 - \beta A^1) \ \Rightarrow\ \Gamma'^4 a' \frac{v'}{c} = \gamma \Gamma^4 a \left( \frac{v}{c} - \beta \right),$    (8.45a)


$A'^1 = \gamma (A^1 - \beta A^0) \ \Rightarrow\ \Gamma'^4 a' = \gamma \Gamma^4 a \left( 1 - \beta \frac{v}{c} \right).$    (8.45b)


Let us now take the S' frame to be the IRF, so that


$v' = 0, \quad \Gamma' = 1, \quad v = \beta c, \quad \gamma = \Gamma.$


Now go back to (8.45). Equation (8.45a) yields 0 = 0. Equation (8.45b)
gives the acceleration in the IRF, also called proper acceleration, as


$a' = \Gamma^5 a (1 - \beta^2) = \Gamma^3 a,$    (8.46)


same result as in (4.29).
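
The result (8.46) can be reproduced numerically by building the 4-acceleration (8.44) in S and boosting it into the instantaneous rest frame. The following sketch assumes units with c = 1 and hypothetical values of v and a; the function names are illustrative.

```python
import math

def four_acceleration_1d(v, a, c=1.0):
    """(A^0, A^1) = Gamma^4 a (v/c, 1), Eq. (8.44), for motion along X."""
    G = 1.0 / math.sqrt(1.0 - v**2 / c**2)
    return (G**4 * a * v / c, G**4 * a)

def boost_x(A, beta):
    """Time and x components under the boost S(beta, 0, 0)S'."""
    g = 1.0 / math.sqrt(1.0 - beta**2)
    return (g * (A[0] - beta * A[1]), g * (A[1] - beta * A[0]))

v, a = 0.8, 2.0                      # hypothetical speed (units of c) and 3-acceleration
A = four_acceleration_1d(v, a)
A_rest = boost_x(A, v)               # boost into the IRF: beta = v/c
G = 1.0 / math.sqrt(1.0 - v**2)
proper_acceleration = G**3 * a       # right-hand side of (8.46)
print(A_rest, proper_acceleration)
```

In the IRF the time component vanishes and the space component is the proper acceleration, so the 4-acceleration there is simply (0, a').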


8.7.3. LT of 4-momentum 


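We shall take the same steps as for 4-velocity. Recall the time and space
components of 4-momentum as given in (8.19):


$\vec{P} = \left( \frac{\mathcal{E}}{c}, \mathbf{p} \right), \quad \mathcal{E} = \Gamma m_o c^2, \quad \mathbf{p} = \Gamma m_o \mathbf{v}.$    (8.47)


Apply the general transformation given in Eq. (7.78) to these components.
First the time component. From (7.78a), with $\boldsymbol{\nu} = \mathbf{v}/c$,


$\frac{\mathcal{E}'}{c} = \gamma \left( \frac{\mathcal{E}}{c} - \boldsymbol{\beta} \cdot \mathbf{p} \right),$
or $\Gamma' m_o c = \gamma (\Gamma m_o c - \boldsymbol{\beta} \cdot \Gamma m_o \mathbf{v}),$    (8.48)
or $\Gamma' = \gamma \Gamma (1 - \boldsymbol{\beta} \cdot \boldsymbol{\nu}).$
For one-dimensional motion,


$\Gamma' = \gamma \Gamma (1 - \beta \nu),$
$\Gamma = \gamma \Gamma' (1 + \beta \nu').$    (8.49)


We get back Eq. (8.42). Now the space component. From (7.78e),


$\mathbf{p}' = \mathbf{p} + \mathbf{n} \left[ (\gamma - 1)(\mathbf{n} \cdot \mathbf{p}) - \gamma \beta \frac{\mathcal{E}}{c} \right].$    (8.50)


We shall get back Eq. (4.23) if we use the definitions of $\mathcal{E}$ and $\mathbf{p}$ as given in
(8.47).

The inverse transformation, corresponding to the boost $[S'(-c\boldsymbol{\beta})S]$, is
obtained by replacing $\mathbf{n}$ with $-\mathbf{n}$:


$\mathbf{p} = \mathbf{p}' + \mathbf{n} \left[ (\gamma - 1)(\mathbf{n} \cdot \mathbf{p}') + \gamma \beta \frac{\mathcal{E}'}{c} \right].$    (8.51)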


8.7.4. LT of 4-force 


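The components of the 4-force are given in (8.31), which we rewrite here:
$\vec{\mathcal{F}} = \Gamma \left( \frac{\Pi}{c}, \mathbf{F} \right).$    (8.52)


Here $\mathbf{F}$ is the 3-force as defined by Newton’s second law of motion $\frac{d\mathbf{p}}{dt} = \mathbf{F}$.
Apply the general transformation given in Eq. (7.78) to these components.
First the time component. From (7.78a), with $\boldsymbol{\nu} = \mathbf{v}/c$,

$\Gamma' \frac{\Pi'}{c} = \gamma \left( \Gamma \frac{\Pi}{c} - \boldsymbol{\beta} \cdot \Gamma \mathbf{F} \right),$

or $\Pi' = \frac{\gamma \Gamma}{\Gamma'} (\Pi - c \boldsymbol{\beta} \cdot \mathbf{F})$
$\quad = \frac{\Pi - c \boldsymbol{\beta} \cdot \mathbf{F}}{1 - \boldsymbol{\beta} \cdot \boldsymbol{\nu}},$ using (8.42).    (8.53)


Now the space component. From (7.78e),
$\Gamma' \mathbf{F}' = \Gamma \mathbf{F} + \mathbf{n} \left[ (\gamma - 1)(\mathbf{n} \cdot \Gamma \mathbf{F}) - \gamma \beta \Gamma \frac{\Pi}{c} \right],$

using (8.42), $\mathbf{F}' = \frac{\mathbf{F} + \mathbf{n} \left[ (\gamma - 1)(\mathbf{n} \cdot \mathbf{F}) - \gamma \beta \frac{\Pi}{c} \right]}{\gamma (1 - \boldsymbol{\beta} \cdot \boldsymbol{\nu})}.$    (8.54)


For a particle moving with constant rest mass, $\Pi = \mathbf{F} \cdot \mathbf{v} = \mathbf{F} \cdot c\boldsymbol{\nu}$. In this
case


$\mathbf{F}' = \frac{\mathbf{F} + \mathbf{n} [(\gamma - 1)(\mathbf{n} \cdot \mathbf{F}) - \gamma \beta\, \mathbf{F} \cdot \boldsymbol{\nu}]}{\gamma (1 - \boldsymbol{\beta} \cdot \boldsymbol{\nu})}.$    (8.55)


The inverse transformation, corresponding to the boost $[S'(-c\boldsymbol{\beta})S]$, is
obtained by replacing $\mathbf{n}$ with $-\mathbf{n}$:


$\mathbf{F} = \frac{\mathbf{F}' + \mathbf{n} [(\gamma - 1)(\mathbf{n} \cdot \mathbf{F}') + \gamma \beta\, \mathbf{F}' \cdot \boldsymbol{\nu}']}{\gamma (1 + \boldsymbol{\beta} \cdot \boldsymbol{\nu}')}.$    (8.56)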


Corollaries of the force transformation formulas: 


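(1) A longitudinal force $\mathbf{F}_\parallel$ (i.e. a force in the direction of the boost
velocity $c\boldsymbol{\beta}$) is invariant under LT.
Proof: Let S' be the rest frame of the particle. Hence, $\mathbf{v}' = 0$. By
assumption, $\mathbf{n}(\mathbf{n} \cdot \mathbf{F}') = \mathbf{F}'$. Hence, from (8.56),
$\mathbf{F}_\parallel = \mathbf{F}'_\parallel.$    (8.57)
(QED)
(2) A transverse force $\mathbf{F}'_\perp$ in the rest frame S' of the particle transforms to
$\mathbf{F}'_\perp / \gamma$ in the Lab frame S.
Proof: By assumption $\mathbf{n} \cdot \mathbf{F}' = 0$. By a similar argument,


$\mathbf{F}_\perp = \frac{\mathbf{F}'_\perp}{\gamma}.$    (8.58)


(QED)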


Fig. 8.2. Longitudinal and transverse force and area cross-section in the rest frame. 


(3) Pressure p in a perfect fluid is Lorentz invariant.

Proof: By perfect fluid we mean a fluid in which there is no shear
stress. If we consider an infinitesimal rectangular box inside the fluid
(Fig. 8.2), the stress on all the bounding surfaces will be the same
normal compressive stress p, called pressure.

Let the pressure inside the fluid at a certain event Θ, in the
corresponding instantaneous rest frame S', be p'. Referring to Fig. 8.2,
the force on the right face is $\delta F'_\parallel$ and the area is $\delta A'_\perp$. Similarly, the force
on the top face is $\delta F'_\perp$ and the area is $\delta A'_\parallel$. Hence, $p' = \frac{\delta F'_\parallel}{\delta A'_\perp} = \frac{\delta F'_\perp}{\delta A'_\parallel}$.
Here the subscript on an area tells whether the face lies perpendicular or
parallel to the boost direction.


Apply Rules 3, 4 of Sec. 2.7 to the above surface areas. The
longitudinal dimension contracts, but the transverse dimensions do not.
Hence, $\delta A'_\parallel \to \delta A_\parallel = \delta A'_\parallel / \gamma$; $\delta A'_\perp \to \delta A_\perp = \delta A'_\perp$. If we now apply (8.57) and
(8.58), we get


$p_x = \frac{\delta F_\parallel}{\delta A_\perp} = \frac{\delta F'_\parallel}{\delta A'_\perp} = p',$
$p_y = \frac{\delta F_\perp}{\delta A_\parallel} = \frac{\delta F'_\perp / \gamma}{\delta A'_\parallel / \gamma} = p'.$
Hence, $p(x) = p_x(x) = p_y(x) = p'(x').$    (8.59)

(QED) 
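
The three corollaries can be illustrated with a short numerical sketch. It is not part of the original derivation: it assumes the form of (8.56) with v' = 0, a boost along X, and hypothetical numbers for the rest-frame forces, pressure and face area.

```python
import math

def lab_force_from_rest_force(F_rest, beta, n):
    """Eq. (8.56) with v' = 0: F = [F' + n (gamma - 1)(n.F')] / gamma."""
    g = 1.0 / math.sqrt(1.0 - beta**2)
    n_dot_F = sum(ni * fi for ni, fi in zip(n, F_rest))
    return [(f + ni * (g - 1.0) * n_dot_F) / g for f, ni in zip(F_rest, n)]

beta, n = 0.6, (1.0, 0.0, 0.0)
g = 1.0 / math.sqrt(1.0 - beta**2)

F_longitudinal = lab_force_from_rest_force((3.0, 0.0, 0.0), beta, n)   # corollary (1)
F_transverse = lab_force_from_rest_force((0.0, 2.0, 0.0), beta, n)     # corollary (2)

# Corollary (3): a small box with rest-frame pressure p_rest and face area dA (assumed values).
p_rest, dA = 7.0, 1.0e-4
dF_parallel = lab_force_from_rest_force((p_rest * dA, 0.0, 0.0), beta, n)[0]
dF_perpendicular = lab_force_from_rest_force((0.0, p_rest * dA, 0.0), beta, n)[1]
p_x = dF_parallel / dA               # face normal to the boost keeps its area
p_y = dF_perpendicular / (dA / g)    # face parallel to the boost contracts by 1/gamma
print(F_longitudinal, F_transverse, p_x, p_y)
```

The reduction of the transverse force by 1/γ is exactly compensated by the contraction of the area on which it acts, which is why the pressure comes out the same in both directions.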


8.8. Conservation of 4-Momentum of a System of Particles 


8.8.1. Zero momentum frame, equivalence of $\mathcal{E}$ and mass


Consider a system of N non-interacting particles,
their individual rest masses, velocities and En-Menta being
$\{ m_o^i, \mathbf{v}^i, \vec{p}^{\,i} = \Gamma^i m_o^i (c, \mathbf{v}^i);\ i = 1, 2, \ldots, N \}$, with respect to some inertial frame
S. The total energy, total momentum and total En-Mentum of this system
are as follows:


$\mathcal{E} = \sum_{i=1}^{N} \Gamma^i m_o^i c^2, \quad \mathbf{P} = \sum_{i=1}^{N} \Gamma^i m_o^i \mathbf{v}^i, \quad \vec{P} = (\mathcal{E}/c, \mathbf{P}).$    (8.60)


In performing the summation in the above equation, we count each of the N particles at the same instant of time, i.e. simultaneously. On the space-time diagram the counting events lie on one XYZ-hyperplane.

Let us consider the boost $\{S' \to (\mathbf{n}\beta) \to S\}$, and apply the Lorentz transformation formulas (7.79a) and (7.79e) to the En-Mentum $\vec{P}$:

$$\mathcal{E}/c = \gamma\,[\mathcal{E}'/c + \beta\,\mathbf{n}\cdot\mathbf{P}'], \tag{8.61a}$$

$$\mathbf{P} = \mathbf{P}' + \mathbf{n}\,[(\gamma - 1)(\mathbf{n}\cdot\mathbf{P}') + \gamma\beta(\mathcal{E}'/c)]. \tag{8.61b}$$


The above transformation equations are valid, a priori, for the En-Menta $(\varepsilon^i/c, \mathbf{p}^i)$ of individual particles referred to in (8.60). However, due to the linear character of the transformation equations, they are also valid for the total En-Mentum. The 4-vectors $(\mathcal{E}/c, \mathbf{P})$, $(\mathcal{E}'/c, \mathbf{P}')$ in Eq. (8.61) now stand for the total En-Mentum 4-vector in S and S′, respectively.

For obtaining $\mathcal{E}'$, $\mathbf{P}'$ in the above equation, we perform the summation
by the same procedure as for Eq. (8.60). However, in this case the counting 
events are simultaneous in S', but not so in S. This should not cause any 
problem, because the particles being non-interactive, and under no external 
forces, keep their individual En-Menta constant as they progress along their 
respective world lines. 

What happens when two of them collide? The collision of two particles
is one event, a single point on the space-time diagram — same point 
whether in S or in S', although their coordinates will differ. 


In Sec. 4.9 we have presented an example of elastic collision between a 
photon and an electron, known as Compton scattering. The En-Mentum is 
conserved, and the CM of the system keeps moving on as a single particle 
along its world line. 

Suppose the collision is perfectly inelastic, i.e. they coalesce and form a new particle C. Even then the En-Mentum of C will be the sum of the En-Menta of A + B, just after collision. That is, $\mathcal{E}_C = \mathcal{E}_A + \mathcal{E}_B$; $\mathbf{p}_C = \mathbf{p}_A + \mathbf{p}_B$. However, in this case part of the energy $\mathcal{E}_C$ goes into the inner energy of the coalesced particle C, say in the form of “excitation energy”. Problem 4.6 gives a good example of this process. In this case also the CM keeps moving on as a single particle, just after the reaction.

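Let us go back to (8.61). Of all frames S′ there will be one in which $\mathbf{P}' = 0$. We shall identify this frame as $S_{\rm rest}$ and call it the rest frame of the system $\mathscr{S}$, or better still the zero momentum frame (ZMF or ZM frame) of the system. This is the analogue of the centre of mass frame of non-relativistic classical mechanics.

We go back to Eq. (8.61), set $\mathbf{P}' = 0$, denote the energy in $S_{\rm rest}$ as $\mathcal{E}_0$ and obtain:

$$\mathcal{E} = \gamma\,\mathcal{E}_0, \tag{8.62a}$$
$$\mathbf{P} = \mathbf{n}\gamma\beta\,(\mathcal{E}_0/c) = \gamma\,\mathbf{v}\,(\mathcal{E}_0/c^2) = \gamma\,\mathbf{v} M_0 = \Gamma\,\mathbf{v} M_0, \tag{8.62b}$$
where $\mathbf{v} = c\beta\,\mathbf{n}$ is the boost velocity, and
$$M_0 = \mathcal{E}_0/c^2 = \text{rest mass of the system } \mathscr{S} \text{ in the frame } S. \tag{8.62c}$$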


This brings us to the following conclusion: 


Conclusion: The system of particles can be considered to be a single particle at rest in $S_{\rm rest}$ and moving with the boost velocity $\mathbf{v}$ with respect to S. It has a rest energy $\mathcal{E}_0$ and a rest mass $M_0$, and the two are related by the relation $\mathcal{E}_0 = M_0 c^2$. Its momentum $\mathbf{P}$, as shown in (8.62b), is in accordance with the definition of the momentum of a single particle as given in (4.44).

We have replaced the Lorentz factor γ associated with the LT with the dynamic Lorentz factor Γ associated with the motion of a moving particle (see Eq. (4.38)). In this case, the particle velocity is identical with the boost velocity.

Thus the inner kinetic energy $T_0$ of the system contributes to the inertial mass of the system by an amount equal to $T_0/c^2$.
We shall illustrate the concepts with a few examples. 
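
The conclusion above can be checked numerically. The following is a minimal sketch, not part of the original derivation: it sums the En-Menta of a few free particles, then extracts the system rest mass $M_0 = \sqrt{(\mathcal{E}/c)^2 - \mathbf{P}^2}/c$ and the velocity of the ZM frame $\mathbf{v} = c^2\mathbf{P}/\mathcal{E}$. The function names and the choice of units with c = 1 are assumptions made for illustration.

```python
import math

C = 1.0  # assumption: units in which c = 1, so velocities are fractions of c


def four_momentum(m, v):
    """En-Mentum (E/c, px, py, pz) of a particle of rest mass m and velocity v (3-tuple)."""
    v2 = sum(x * x for x in v) / C**2
    Gamma = 1.0 / math.sqrt(1.0 - v2)          # dynamic Lorentz factor
    return (Gamma * m * C,) + tuple(Gamma * m * x for x in v)


def total_four_momentum(particles):
    """Sum the En-Menta of all particles, counted at one instant of the frame S."""
    total = [0.0, 0.0, 0.0, 0.0]
    for m, v in particles:
        for k, comp in enumerate(four_momentum(m, v)):
            total[k] += comp
    return tuple(total)


def system_rest_mass_and_velocity(particles):
    """Return (M0, v_ZM): rest mass of the system and velocity of its ZM frame in S."""
    e_over_c, px, py, pz = total_four_momentum(particles)
    p2 = px * px + py * py + pz * pz
    M0 = math.sqrt(e_over_c**2 - p2) / C        # invariant norm of the total En-Mentum
    v_zm = tuple(C * p / e_over_c for p in (px, py, pz))
    return M0, v_zm


if __name__ == "__main__":
    # Two unit-mass particles flying apart at 0.6c: S is already the ZM frame.
    print(system_rest_mass_and_velocity([(1.0, (0.6, 0.0, 0.0)), (1.0, (-0.6, 0.0, 0.0))]))
```

In the example call the two rest masses add up to 2, but the system rest mass comes out as 2.5: the difference 0.5 is the inner kinetic energy $T_0/c^2$ of the pair.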


8.9. Illustrative Numerical Examples III


We shall now specialize our discussion to a system in which there are two 
particles before a collision, and two or more particles after the collision. 
The collision may result in the production of a new system of particles 
which may or may not bear any resemblance to the original ones. One such example is the production of a pi-meson like π⁰, π⁺ from a nucleon-nucleon collision (p + p, p + n); another is the creation of an electron–positron pair when a γ-ray hits an electron. In such cases, the En-Mentum of the system may not have the simplest expression as given in (8.17), in which the rest mass $m_0$ and the dynamic Lorentz factor Γ have been delineated clearly.

Even when the En-Mentum looks different, we can still filter out the 
rest mass and the dynamic Lorentz factor by recasting the expression 
suitably, as in the following example: 


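$$\vec{P} = (A, B, C, D) = m_0\left(\frac{A}{m_0 c}\right)\left(c, \frac{cB}{A}, \frac{cC}{A}, \frac{cD}{A}\right). \tag{8.63}$$

This is in the form of Eq. (8.17), if we identify

$$\frac{A}{m_0 c} = \Gamma; \quad \frac{cB}{A} = v_x; \quad \text{etc.} \tag{8.64}$$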


8.9.1. Example 1: Relativistic billiard balls 


The example we shall now discuss, and work out the details, will clarify the 
notion of the ZM frame, in particular how to find this frame, and illustrate 
how this intermediary will help us find a solution to the problem. 

Figure 8.3(a) shows two identical billiard balls A and B of equal rest mass $m_0$. B is at rest, and A is approaching B along the X-axis with relativistic velocity $c\mathbf{v}_{A\text{-in}} = c\,v_{A\text{-in}}\mathbf{e}_x$. The dynamic Lorentz factor corresponding to this velocity is $\Gamma_{A\text{-in}} = 1/\sqrt{1 - v_{A\text{-in}}^2}$. After the collision, A and B bounce out with velocities $c\mathbf{v}_{A\text{-out}}$ and $c\mathbf{v}_{B\text{-out}}$, respectively. Our task is to calculate these velocities and the angles θ and φ they make with the X-axis.
Figure 8.3(b) shows the billiard ball experiment from the ZM frame. Due to symmetry both balls approach each other and then fly apart with the same speed cv′, making angle θ′ with the X-axis. The dynamic Lorentz factor for both particles is $\Gamma' = 1/\sqrt{1 - v'^2}$.
We shall proceed towards the answers in several steps.


Step 1: Find the velocity of the ZM frame with respect to the Lab frame. 
For a while we shall suppress the subscript “A-in” in $v_{A\text{-in}}$, $\Gamma_{A\text{-in}}$ so that the equations look less cumbersome. Γ and v in the equations below will mean $\Gamma_{A\text{-in}}$ and $v_{A\text{-in}}$.


Fig. 8.3 Elastic collision between two equal balls seen from S and S′. (a) Lab frame S; (b) ZM frame S′.


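Let cβ be the boost velocity, $\gamma = 1/\sqrt{1 - \beta^2}$ the boost Lorentz factor. We shall write the individual and total En-Menta before the collision. Since the encounter takes place on the XY-plane, we shall write only three components of En-Mentum in the format $\vec{P} = \{p^\mu\} = (p^0, p^1, p^2)$ (as in Eq. (8.17), but without $p^3$). In the Lab frame, represented by S, the relevant 4-vectors are as follows:

$$\vec{P}_{A\text{-in}} = \Gamma m_0 c\,(1, v, 0), \tag{8.65a}$$
$$\vec{P}_{B\text{-in}} = m_0 c\,(1, 0, 0), \tag{8.65b}$$
$$\vec{P}_{\rm in} = \vec{P}_{A\text{-in}} + \vec{P}_{B\text{-in}} = m_0 c\,(\Gamma + 1, \Gamma v, 0). \tag{8.65c}$$

LT will convert $\vec{P}_{\rm in} \to \vec{P}'_{\rm in}$ with components $(P'^0_{\rm in}, P'^1_{\rm in}, P'^2_{\rm in})$ of which the first two are non-zero:
$$P'^0_{\rm in} = \gamma\,\big((\Gamma + 1) - \beta\Gamma v\big)\, m_0 c, \tag{8.66a}$$
$$P'^1_{\rm in} = \gamma\,\big(\Gamma v - \beta(\Gamma + 1)\big)\, m_0 c. \tag{8.66b}$$
To identify the ZM frame we must set $P'^1_{\rm in} = 0$. This leads to

$$\beta = \frac{\Gamma v}{\Gamma + 1}. \tag{8.67}$$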


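Going back to (8.66a), and using (8.67),

$$\begin{aligned}
P'^0_{\rm in} &= \gamma\left\{(\Gamma + 1) - \frac{(\Gamma v)^2}{\Gamma + 1}\right\} m_0 c\\
&= \frac{\gamma\, m_0 c}{\Gamma + 1}\left\{\Gamma^2 + 2\Gamma + 1 - \Gamma^2 v^2\right\}\\
&= \frac{\gamma\, m_0 c}{\Gamma + 1}\,\{2\Gamma + 2\}\\
&= 2\gamma\, m_0 c.
\end{aligned} \tag{8.68}$$

We have used the fourth identity from Eq. (2.22), i.e. $\Gamma^2 - \Gamma^2 v^2 = 1$. The total En-Mentum of the 2-particle system in the ZM frame can now be written as

$$\vec{P}'_{\rm in} = 2\gamma\, m_0 c\,(1, 0, 0). \tag{8.69}$$

Particles A and B, together representing total energy $(\Gamma + 1)m_0c^2$ in the Lab frame S (see Eq. (8.65c)), transform into a stationary single particle with rest mass energy $2\gamma\, m_0 c^2$ in the ZM frame S′.

We can find a relation between Γ and γ by transforming $\vec{P}'_{\rm in} \to \vec{P}_{\rm in}$ from the ZM frame S′ to the Lab frame S (using Eq. (7.77a) and setting β → −β).

$$P^0_{\rm in} = \gamma\,(P'^0_{\rm in} + \beta P'^1_{\rm in}) = \gamma\,(2\gamma + 0)\, m_0 c = 2\gamma^2 m_0 c.$$
But $P^0_{\rm in} = (\Gamma + 1)\, m_0 c$, from Eq. (8.65c). Hence,
$$\Gamma + 1 = 2\gamma^2. \tag{8.70}$$

We can transform this into another useful relation, using the identity $\gamma^2 - 1 = \gamma^2\beta^2$:

$$\Gamma = 2\gamma^2 - 1 = \gamma^2(1 + \beta^2). \tag{8.71}$$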
Step 2: Find the En-Menta of the balls before and after the collision. 


In the ZM frame, represented by S′, the relevant En-Menta are as follows:

$$\begin{aligned}
\text{(i)}\quad & \vec{P}'_{A\text{-in}} = \Gamma' m_0 c\,(1, v', 0);\\
\text{(ii)}\quad & \vec{P}'_{B\text{-in}} = \Gamma' m_0 c\,(1, -v', 0);\\
\text{(iii)}\quad & \vec{P}'_{A\text{-out}} = \Gamma' m_0 c\,(1, v'\cos\theta', v'\sin\theta');\\
\text{(iv)}\quad & \vec{P}'_{B\text{-out}} = \Gamma' m_0 c\,(1, -v'\cos\theta', -v'\sin\theta').
\end{aligned} \tag{8.72}$$

We shall now find the components of the same 4-vectors in the Lab frame S, which is the rest frame of B. Hence, the boost velocity −β is the same as the velocity of B in S′, and the boost Lorentz factor γ is the same as the dynamic Lorentz factor Γ′:

$$\beta = v'; \quad \gamma = \Gamma' = \frac{1}{\sqrt{1 - \beta^2}} = \frac{1}{\sqrt{1 - v'^2}}. \tag{8.73}$$

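First, the in-coming components, by LT formulas (7.77a), with boost velocity −β, applied to the components shown in lines (i), (ii) of Eq. (8.72).

$$\begin{aligned}
P^0_{A\text{-in}} &= \gamma\,(P'^0_{A\text{-in}} + \beta P'^1_{A\text{-in}}) = \gamma\Gamma' m_0 c\,(1 + \beta \times v') = \gamma^2(1 + \beta^2)\, m_0 c,\\
P^1_{A\text{-in}} &= \gamma\,(P'^1_{A\text{-in}} + \beta P'^0_{A\text{-in}}) = \gamma\Gamma' m_0 c\,(v' + \beta \times 1) = (2\gamma^2\beta)\, m_0 c,\\
P^2_{A\text{-in}} &= P'^2_{A\text{-in}} = 0;
\end{aligned} \tag{8.74a}$$

$$\begin{aligned}
P^0_{B\text{-in}} &= \gamma\,(P'^0_{B\text{-in}} + \beta P'^1_{B\text{-in}}) = \gamma\Gamma' m_0 c\,(1 + \beta(-v')) = \gamma^2(1 - \beta^2)\, m_0 c = m_0 c,\\
P^1_{B\text{-in}} &= \gamma\,(P'^1_{B\text{-in}} + \beta P'^0_{B\text{-in}}) = \gamma\Gamma' m_0 c\,(-v' + \beta \times 1) = 0,\\
P^2_{B\text{-in}} &= P'^2_{B\text{-in}} = 0.
\end{aligned} \tag{8.74b}$$

Next, the out-going components, by LT of the components shown in lines (iii), (iv) of Eq. (8.72):

$$\begin{aligned}
P^0_{A\text{-out}} &= \gamma\,(P'^0_{A\text{-out}} + \beta P'^1_{A\text{-out}}) = \gamma\Gamma' m_0 c\,(1 + \beta v'\cos\theta') = \gamma^2(1 + \beta^2\cos\theta')\, m_0 c,\\
P^1_{A\text{-out}} &= \gamma\,(P'^1_{A\text{-out}} + \beta P'^0_{A\text{-out}}) = \gamma\Gamma' m_0 c\,(v'\cos\theta' + \beta \times 1) = \gamma^2\beta(1 + \cos\theta')\, m_0 c,\\
P^2_{A\text{-out}} &= P'^2_{A\text{-out}} = \Gamma' v'\sin\theta'\, m_0 c = \gamma\beta\sin\theta'\, m_0 c.
\end{aligned} \tag{8.74c}$$

$$\begin{aligned}
P^0_{B\text{-out}} &= \gamma\,(P'^0_{B\text{-out}} + \beta P'^1_{B\text{-out}}) = \gamma\Gamma' m_0 c\,(1 + \beta(-v')\cos\theta') = \gamma^2(1 - \beta^2\cos\theta')\, m_0 c,\\
P^1_{B\text{-out}} &= \gamma\,(P'^1_{B\text{-out}} + \beta P'^0_{B\text{-out}}) = \gamma\Gamma' m_0 c\,(-v'\cos\theta' + \beta \times 1) = \gamma^2\beta(1 - \cos\theta')\, m_0 c,\\
P^2_{B\text{-out}} &= P'^2_{B\text{-out}} = -\gamma\beta\sin\theta'\, m_0 c.
\end{aligned} \tag{8.74d}$$

Compactly,

$$\begin{aligned}
\vec{P}_{A\text{-in}} &= m_0 c\,\big(\gamma^2(1 + \beta^2),\ 2\gamma^2\beta,\ 0\big) = m_0 c\,\gamma^2(1 + \beta^2)\left(1,\ \frac{2\beta}{1 + \beta^2},\ 0\right),\\
\vec{P}_{B\text{-in}} &= m_0 c\,(1, 0, 0),\\
\vec{P}_{A\text{-out}} &= m_0 c\,\big(\gamma^2(1 + \beta^2\cos\theta'),\ \gamma^2\beta(1 + \cos\theta'),\ \gamma\beta\sin\theta'\big)\\
&= m_0 c\,\gamma^2(1 + \beta^2\cos\theta')\left(1,\ \frac{\gamma\beta(1 + \cos\theta')}{\gamma(1 + \beta^2\cos\theta')},\ \frac{\beta\sin\theta'}{\gamma(1 + \beta^2\cos\theta')}\right),\\
\vec{P}_{B\text{-out}} &= m_0 c\,\big(\gamma^2(1 - \beta^2\cos\theta'),\ \gamma^2\beta(1 - \cos\theta'),\ -\gamma\beta\sin\theta'\big)\\
&= m_0 c\,\gamma^2(1 - \beta^2\cos\theta')\left(1,\ \frac{\gamma\beta(1 - \cos\theta')}{\gamma(1 - \beta^2\cos\theta')},\ \frac{-\beta\sin\theta'}{\gamma(1 - \beta^2\cos\theta')}\right).
\end{aligned} \tag{8.75}$$

Going back to Eqs. (8.74) and (8.75),

$$\begin{aligned}
\vec{P}_{\rm in} &= \vec{P}_{A\text{-in}} + \vec{P}_{B\text{-in}} = 2\gamma^2 m_0 c\,(1, \beta, 0),\\
\vec{P}_{\rm out} &= \vec{P}_{A\text{-out}} + \vec{P}_{B\text{-out}} = 2\gamma^2 m_0 c\,(1, \beta, 0),
\end{aligned} \tag{8.76}$$

confirming that $\vec{P}_{\rm in} = \vec{P}_{\rm out}$ in the frame S.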
Step 3: Find the velocities of the balls before and after the collision. 


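Going back to Eqs. (8.74) and (8.75),

$$\Gamma_{A\text{-in}} = \gamma^2(1 + \beta^2); \quad v_{A\text{-in}} = \frac{2\beta}{1 + \beta^2}, \tag{8.77a}$$
$$\Gamma_{A\text{-out}} = \gamma^2(1 + \beta^2\cos\theta'); \quad v_{A\text{-out-}x} = \frac{\gamma\beta(1 + \cos\theta')}{\gamma(1 + \beta^2\cos\theta')}, \tag{8.77b}$$
$$v_{A\text{-out-}y} = \frac{\beta\sin\theta'}{\gamma(1 + \beta^2\cos\theta')}, \tag{8.77c}$$
$$\Gamma_{B\text{-out}} = \gamma^2(1 - \beta^2\cos\theta'); \quad v_{B\text{-out-}x} = \frac{\gamma\beta(1 - \cos\theta')}{\gamma(1 - \beta^2\cos\theta')}, \tag{8.77d}$$
$$v_{B\text{-out-}y} = \frac{-\beta\sin\theta'}{\gamma(1 - \beta^2\cos\theta')}. \tag{8.77e}$$

We can now find the angles θ and φ:

$$\tan\theta = \frac{v_{A\text{-out-}y}}{v_{A\text{-out-}x}} = \frac{\sin\theta'}{\gamma(1 + \cos\theta')} = \frac{1}{\gamma}\tan\frac{\theta'}{2}, \tag{8.78a}$$
$$\tan\phi = \frac{|v_{B\text{-out-}y}|}{v_{B\text{-out-}x}} = \frac{\sin\theta'}{\gamma(1 - \cos\theta')} = \frac{1}{\gamma}\cot\frac{\theta'}{2}, \tag{8.78b}$$
$$\tan\theta\,\tan\phi = \frac{1}{\gamma^2}. \tag{8.78c}$$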


Note that the above relationship between θ and θ′ on the one hand, and φ and θ′ on the other, is consistent with the angle transformation formula given in Eq. (4.133). The exercise to confirm this is left to the reader.

We can now use Eq. (8.77a) to express γ² in terms of $\Gamma_{A\text{-in}}$, using the identity $\gamma^2 - 1 = \beta^2\gamma^2$:
$$\Gamma_{A\text{-in}} = 2\gamma^2 - 1, \quad\text{or}\quad \gamma^2 = \frac{\Gamma_{A\text{-in}} + 1}{2}, \tag{8.79}$$
where $\Gamma_{A\text{-in}} = 1/\sqrt{1 - v_{A\text{-in}}^2}$.
We now have a more convenient equation for the angles in terms of the velocity of the impinging particle A:

$$\tan\theta\,\tan\phi = \frac{2}{\Gamma_{A\text{-in}} + 1}. \tag{8.80}$$

The reader should work out the exercise R1 in Sec. 8.10 to get a full appreciation of the equations presented in this section.
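
The three steps of this example can be traced numerically. The sketch below is illustrative only: it takes the incoming speed $v_{A\text{-in}}$ (in units of c) and the ZM scattering angle θ′, builds the Lab-frame En-Menta in units of $m_0c$ from the closed forms derived above, and returns the Lab angles θ and φ. The function name and return layout are assumptions, not notation from the text.

```python
import math


def billiard_outcome(v_in, theta_zm):
    """Elastic collision of equal balls, B initially at rest in the Lab frame.

    v_in     : speed of A in the Lab frame, in units of c
    theta_zm : scattering angle theta' in the ZM frame, in radians
    Returns (Gamma_in, beta, gamma, pA_out, pB_out, theta, phi), where the
    En-Menta are (p0, p1, p2) in units of m0*c.
    """
    Gamma_in = 1.0 / math.sqrt(1.0 - v_in**2)
    beta = Gamma_in * v_in / (Gamma_in + 1.0)      # boost velocity of the ZM frame
    gamma = 1.0 / math.sqrt(1.0 - beta**2)         # boost Lorentz factor
    c, s = math.cos(theta_zm), math.sin(theta_zm)

    # Lab-frame components of the out-going En-Menta
    pA = (gamma**2 * (1 + beta**2 * c), gamma**2 * beta * (1 + c), gamma * beta * s)
    pB = (gamma**2 * (1 - beta**2 * c), gamma**2 * beta * (1 - c), -gamma * beta * s)

    theta = math.atan2(pA[2], pA[1])               # angle of A with the X-axis
    phi = math.atan2(-pB[2], pB[1])                # angle of B, taken as positive
    return Gamma_in, beta, gamma, pA, pB, theta, phi


if __name__ == "__main__":
    G, beta, gamma, pA, pB, theta, phi = billiard_outcome(0.8, math.radians(60.0))
    print("beta =", beta, " gamma =", gamma)
    print("theta (deg) =", math.degrees(theta), " phi (deg) =", math.degrees(phi))
    print("tan(theta)*tan(phi) =", math.tan(theta) * math.tan(phi), " 2/(Gamma+1) =", 2 / (G + 1))
```

Summing the two out-going En-Menta should reproduce $(\Gamma + 1, \Gamma v, 0)$, the total before the collision, and the product of the tangents should equal $2/(\Gamma_{A\text{-in}} + 1)$ for every θ′.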


8.9.2. Example 2: Threshold energy for a p + p collision
resulting in the production of a π⁰ particle
We shall consider the following exampleᵃ:


$$p_1 + p_2 \to p_1 + p_2 + \pi^0.$$


The left side represents colliding protons #1 and #2, before collision, and the right side the same protons after collision. The proton $p_2$ represents a stationary target (hydrogen atom), and the proton $p_1$ the bombarding particle. There is a certain minimum kinetic energy $K_0$, called the threshold energy, above which the above reaction can take place. We shall determine this energy, using the ZM frame as a stepping stone.

We have shown the reaction in Fig. 8.4. In the Lab frame, the target proton $p_2$ is stationary before the collision. In the ZM frame, the target proton $p_2$ and the bombarding proton $p_1$ are moving with opposite velocities βc and −βc before the collision and settle down to rest along with the newborn π⁰ after the collision.

We shall adopt the following rest mass values: $m_\pi = 135$ MeV, $m_p = 938.26$ MeV. Now apply energy conservation in the ZM frame:

$$2\Gamma m_p = 2m_p + m_\pi, \quad\text{or}\quad \Gamma = 1 + \frac{m_\pi}{2m_p} = 1 + \frac{135.0}{2 \times 938.26} = 1.0719. \tag{8.81}$$
Now, $\beta^2 = 1 - 1/\Gamma^2 = 0.13$.
Hence, β = 0.36.

Fig. 8.4 Nuclear reaction p + p → p + p + π⁰ seen from two frames: the Lab frame and the ZM frame.


To get the velocity $\beta_0$ of the bombarding particle $p_1$ in the Lab frame, we shall apply the velocity transformation formula (4.7), to go from the S′ (ZM) frame to the S (Lab) frame.

$$\beta_0 = \frac{\beta + \beta}{1 + \beta \times \beta} = \frac{2 \times 0.36}{1 + 0.36 \times 0.36} = 0.637. \tag{8.82}$$

The corresponding dynamic Lorentz factor is

$$\Gamma_0 = \frac{1}{\sqrt{1 - \beta_0^2}} = \frac{1}{\sqrt{1 - 0.637^2}} = 1.3. \tag{8.83}$$

The kinetic energy of $p_1$ should be
$$K_0 = (\Gamma_0 - 1)\,m_p c^2 = 0.3 \times 938.26 = 281.478\ \text{MeV}. \tag{8.84}$$


This much is the minimum kinetic energy to be given to the bombarding 
proton to produce a mass which is less than half this value. The remaining 
half is shared by the three particles emerging from the reaction. 
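
The chain of steps used above (ZM-frame Lorentz factor, ZM speed, velocity addition, Lab-frame Lorentz factor) is easy to reproduce without intermediate rounding. The following is a minimal sketch; the function name and the dictionary it returns are illustrative assumptions, and all energies are in MeV with c = 1.

```python
import math


def pion_threshold_via_zm(m_p=938.26, m_pi=135.0):
    """Threshold kinetic energy (MeV) for p + p -> p + p + pi0, target proton at rest.

    Follows the steps of the worked example: energy conservation in the ZM frame,
    then a velocity transformation back to the Lab frame.
    """
    Gamma = 1.0 + m_pi / (2.0 * m_p)             # each proton in the ZM frame
    beta = math.sqrt(1.0 - 1.0 / Gamma**2)       # proton speed in the ZM frame
    beta0 = 2.0 * beta / (1.0 + beta**2)         # bombarding proton speed in the Lab frame
    Gamma0 = 1.0 / math.sqrt(1.0 - beta0**2)
    K0 = (Gamma0 - 1.0) * m_p
    return {"Gamma": Gamma, "beta": beta, "beta0": beta0, "Gamma0": Gamma0, "K0": K0}


if __name__ == "__main__":
    for name, value in pion_threshold_via_zm().items():
        print(f"{name:7s} = {value:.4f}")
```

Carried through without rounding, the same steps give $K_0 \approx 279.7$ MeV, which equals $2m_\pi + m_\pi^2/(2m_p)$; the value 281.478 MeV quoted above results from rounding $\Gamma_0$ to 1.3 before the last multiplication.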


8.9.3. Example 3: Threshold energy for a photon hitting an 
electron to produce an electron—positron pair 


We shall consider an example of (e⁺, e⁻) pair production by photonsᵇ:
$$\gamma + e^- \to e^- + e^+ + e^-.$$


This example is similar to Example 2, except that a gamma ray, which is 
a particle of zero rest mass, now plays the role of the bombarding particle. 
We have made a picture of this reaction in Fig. 8.5. 

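Let $(\mathcal{E}_0, \mathcal{E})$ represent the photon energy in the ZM frame and the Lab frame, respectively, and similarly $(\Gamma_0, \Gamma)$ the dynamic Lorentz factor of the electron in the same two frames, and let m be the rest mass of the electron. Let $-p_0$ be the electron momentum in ZM. Now apply the conservation equations in the ZM frame:
Fig. 8.5 Nuclear reaction γ + e⁻ → e⁻ + e⁺ + e⁻ seen from two frames.

$$\begin{aligned}
\text{Momentum:}\quad & \mathcal{E}_0/c = p_0 = \Gamma_0\beta\, mc.\\
\text{Energy:}\quad & \mathcal{E}_0 + \Gamma_0 mc^2 = 3mc^2.\\
\text{Hence,}\quad & 3 - \Gamma_0 = \Gamma_0\beta,\\
\text{or}\quad & \Gamma_0(1 + \beta) = \sqrt{\frac{1 + \beta}{1 - \beta}} = 3,\\
\text{from which}\quad & \beta = \frac{4}{5}; \quad \Gamma_0 = \frac{3}{1 + \frac{4}{5}} = \frac{5}{3};\\
& \mathcal{E}_0 = p_0 c = \Gamma_0\beta\, mc^2 = \frac{5}{3} \times \frac{4}{5}\, mc^2 = \frac{4}{3}\, mc^2.
\end{aligned} \tag{8.85}$$

Now consider the boost S′(β)S from the ZM frame to the Lab frame. Apply the 4-momentum transformation (8.48) to the energy of the photon. In this case, the boost velocity β is the same as the electron velocity β in the ZM frame, and the boost Lorentz factor $\gamma_{\rm boost}$ is the same as $\Gamma_0$. Note that we have added a subscript to distinguish the boost Lorentz factor from the gamma ray:

$$\begin{aligned}
\mathcal{E}/c &= \gamma_{\rm boost}\,(\mathcal{E}_0/c + \beta p_0),\\
\text{or}\quad \mathcal{E} &= \frac{5}{3} \times \left(\frac{4}{3} + \frac{4}{5} \times \frac{4}{3}\right) mc^2 = \frac{5}{3} \times \frac{4}{3} \times \frac{9}{5}\, mc^2 = 4mc^2\\
&= 4 \times 0.511 = 2.044\ \text{MeV}.
\end{aligned} \tag{8.86}$$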


As in the previous example, we need a photon of minimum energy 4mc² to produce an electron–positron pair of rest mass 2mc². The balance 2mc² goes to the kinetic energies of the pair (e⁺, e⁻) and the target particle e⁻ emerging after the impact.
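
The arithmetic of this example can be reproduced exactly with rational numbers. The sketch below is an illustration, not part of the original text: it works in units where mc² = 1 and c = 1, uses the relation $\Gamma_0(1+\beta) = \sqrt{(1+\beta)/(1-\beta)} = 3$ obtained in the ZM frame, and then boosts the photon energy to the Lab frame. The function name is an assumption.

```python
from fractions import Fraction


def pair_production_threshold():
    """Threshold for gamma + e- -> e- + e+ + e-, in units of m*c^2 (c = 1).

    Returns (beta, Gamma0, E0_zm, E_lab) as exact fractions.
    """
    # ZM frame: Gamma0*(1 + beta) = 3, i.e. (1 + beta)/(1 - beta) = 9
    k = Fraction(9)
    beta = (k - 1) / (k + 1)                 # electron speed in the ZM frame
    Gamma0 = Fraction(3) / (1 + beta)        # electron Lorentz factor in the ZM frame
    E0 = Gamma0 * beta                       # photon energy in the ZM frame (= p0*c)
    # Boost to the Lab frame; for a photon the momentum equals E0/c
    E_lab = Gamma0 * (E0 + beta * E0)
    return beta, Gamma0, E0, E_lab


if __name__ == "__main__":
    beta, Gamma0, E0, E_lab = pair_production_threshold()
    print("beta =", beta, " Gamma0 =", Gamma0, " E0 =", E0, " E_lab =", E_lab)
    print("threshold in MeV:", float(E_lab) * 0.511)
```

Using `Fraction` keeps every intermediate value exact, so the Lab-frame threshold comes out as the integer multiple of mc² found in the text rather than a rounded decimal.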


8.9.4. Example 4: Compton scattering and inverse Compton
scattering


1. (a) Consider a collision between two particles A and B, moving with velocities $\mathbf{u}_A$ and $\mathbf{u}_B$, their respective rest masses being $m_{0A}$ and $m_{0B}$. Their 4-velocities are $\vec{U}_A$ and $\vec{U}_B$. Show that

$$\vec{U}_A \cdot \vec{U}_B = c^2\,\Gamma(\beta), \tag{8.87}$$

where Γ(β) is the Lorentz factor corresponding to the relative velocity cβ between A and B.

Hint: Since the quantity is Lorentz invariant, evaluate the product in the frame of reference in which one of the particles, say B, is at rest. Now write the 4-velocities in this frame.

(b) Let their 4-momenta be $\vec{P}_A$ and $\vec{P}_B$. Show that

$$\vec{P}_A \cdot \vec{P}_B = c^2\,\Gamma(\beta)\, m_{0A} m_{0B}. \tag{8.88}$$

(c) Let the particle A represent a photon propagating with frequency $\nu_B$ in the rest frame of B. Show that

$$\vec{P}_A \cdot \vec{P}_B = h\nu_B\, m_{0B}. \tag{8.89}$$

(d) Suppose both the particles are photons. In the Lab frame, their frequencies are $\nu_A$ and $\nu_B$, and the angle between their paths of propagation is θ. Show that

$$\vec{P}_A \cdot \vec{P}_B = \left(\frac{h}{c}\right)^2 \nu_A\nu_B\,(1 - \cos\theta). \tag{8.90}$$

2. Consider the elastic collision A + B → A′ + B′. Let A, A′ represent a photon and B, B′ a subatomic charged particle (e.g. an electron, a proton) before and after the scattering. The photon is propagating in the directions n and n′ before and after collision. We write their 4-momenta, and the “magnitudes” of the corresponding 4-momenta, as defined in Sec. 7.9.4, and written in Eqs. (7.110), (8.19), (8.22), (8.26), (8.27):

$$\begin{aligned}
\vec{P}_\gamma &= (h\nu/c)(1, \mathbf{n}), & \vec{P}'_\gamma &= (h\nu'/c)(1, \mathbf{n}'), & (\vec{P}_\gamma)^2 &= (\vec{P}'_\gamma)^2 = 0,\\
\vec{P}_e &= m_0\Gamma\,(c, \mathbf{u}), & \vec{P}'_e &= m_0\Gamma'(c, \mathbf{u}'), & (\vec{P}_e)^2 &= (\vec{P}'_e)^2 = m_0^2c^2.
\end{aligned} \tag{8.91}$$

From conservation of 4-momenta:

$$\vec{P}_\gamma + \vec{P}_e = \vec{P}'_\gamma + \vec{P}'_e, \quad\text{or}\quad \vec{P}_\gamma + \vec{P}_e - \vec{P}'_\gamma = \vec{P}'_e. \tag{8.92}$$

Square either side and use Eq. (8.91):

$$\vec{P}_\gamma \cdot \vec{P}'_\gamma = \vec{P}_e \cdot (\vec{P}_\gamma - \vec{P}'_\gamma), \tag{8.93a}$$
$$\text{or,}\quad \frac{h^2\nu\nu'}{c^2}(1 - \mathbf{n}\cdot\mathbf{n}') = \frac{m_0\Gamma h}{c}\left[(\nu - \nu')c - \mathbf{u}\cdot(\nu\mathbf{n} - \nu'\mathbf{n}')\right]. \tag{8.93b}$$

Apply formula (8.93) to Compton scattering. Take u = 0, the initial velocity of the electron. Now obtain the formula (4.124), which we had derived in Chapter 4.


Ans. Here Γ = 1. Also n · n′ = cos θ, where θ is the angle between the initial and the final directions of the photon. Inserting these values, we now get

$$\frac{h^2\nu\nu'}{c^2}(1 - \cos\theta) = m_0 h(\nu - \nu'), \quad\text{or}\quad \frac{m_0 h(\nu - \nu')c^2}{h^2\nu\nu'} = (1 - \cos\theta). \tag{8.94}$$
Now
$$\frac{c(\nu - \nu')}{\nu\nu'} = \frac{c}{\nu'} - \frac{c}{\nu} = \lambda' - \lambda. \tag{8.95}$$
Hence,
$$\lambda' - \lambda = \frac{h}{m_0 c}(1 - \cos\theta). \tag{8.96}$$

We get back the same Compton scattering formula (4.124), which we had derived in Chapter 4 using a long route.


3. Apply formula (8.93) to inverse Compton scattering. An electron moving along the negative X-axis, with a high velocity $\mathbf{u} = -u\,\mathbf{e}_x$, collides with a photon of energy hν proceeding along the positive X-axis. After the collision the photon bounces back to the negative X-axis with energy hν′. Find an expression for the energy of the outgoing photon.


Ans. Write the three 4-vectors involved in this problem, keeping the (time, x) components only.

$$\vec{P}_\gamma = (h\nu/c)(1, 1); \quad \vec{P}'_\gamma = (h\nu'/c)(1, -1); \quad \vec{P}_e = m_0\Gamma c\,(1, -u/c). \tag{8.97}$$

We shall rewrite the formula (8.93) by setting n · n′ = −1, u · n = −u, u · n′ = u.

$$\begin{aligned}
\frac{h^2\nu\nu'}{c^2}(1 - \mathbf{n}\cdot\mathbf{n}') &= \frac{m_0\Gamma h}{c}\left[(\nu - \nu')c - \mathbf{u}\cdot(\nu\mathbf{n} - \nu'\mathbf{n}')\right],\\
\text{or}\quad 2\,\frac{h^2\nu\nu'}{c^2} &= \frac{m_0\Gamma h}{c}\left[(\nu - \nu')c + (\nu + \nu')u\right]\\
&= \frac{m_0\Gamma h}{c}\left[(c + u)\nu - (c - u)\nu'\right]\\
&= (m_0\Gamma h)\left[(1 + u/c)\nu - (1 - u/c)\nu'\right].
\end{aligned} \tag{8.98}$$

Now, in inverse Compton scattering the velocity of the incoming particle is very high, almost equal to the speed of light. We can therefore make the following approximations:

$$(1 + u/c) \approx 2. \quad\text{Also,}\quad \frac{1}{\Gamma^2} = 1 - \frac{u^2}{c^2} = (1 - u/c)(1 + u/c) \approx 2(1 - u/c). \tag{8.99}$$

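We now go back to (8.98) and write:

$$\begin{aligned}
2\,\frac{h^2\nu\nu'}{c^2} &\approx (m_0\Gamma h)\left[2\nu - \frac{1}{2\Gamma^2}\nu'\right],\\
\text{or}\quad h\nu' &\approx \Gamma m_0 c^2\left[1 - \frac{1}{4\Gamma^2}\frac{\nu'}{\nu}\right],\\
\text{or}\quad \left(1 + \frac{m_0 c^2}{4\Gamma h\nu}\right) &\approx \frac{\Gamma m_0 c^2}{h\nu'}.
\end{aligned} \tag{8.100}$$
Let us write the left-hand side as
$$\text{LHS} = \frac{m_0 c^2}{4\Gamma h\nu}\left(\frac{4\Gamma h\nu}{m_0 c^2} + 1\right). \tag{8.101}$$
Then
$$\frac{\nu'}{\nu} = \frac{4\Gamma^2}{\dfrac{4\Gamma h\nu}{m_0 c^2} + 1}. \tag{8.102}$$

We can also write this in terms of the initial and final photon energies $E = h\nu$ and $E' = h\nu'$:

$$\frac{E'}{E} = \frac{4\Gamma^2}{\dfrac{4\Gamma E}{m_0 c^2} + 1}. \tag{8.103}$$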


4. Consider the case in which the incoming electron energy is 10³ MeV (about 2000 times its rest mass). The initial photon energy is almost the same as (or more than) the energy of the electron. Show that the incoming electron transfers almost all of its energy to the γ-ray. (Taylor and Wheeler [6, Example 8-23, p. 270])


Ans. In this case Γ ≈ 2000, so that the “+1” in the denominator can be ignored. Using this approximation in (8.103) we get the energy of the back-scattered γ-ray:

$$E' \approx \Gamma m_0 c^2, \tag{8.104}$$

which is the energy of the incoming electron.


5. Take a photon of the Cosmic Microwave Background, having photon energy hν equal to 10⁻³ eV. A proton in the cosmic rays, of kinetic energy 10¹⁴ MeV, collides with it. The proton rest mass energy is 939 MeV. For simplicity take it as 10³ MeV. Find the energy of the photon after the collision (see [4, p. 123]).


Ans. We shall use Eq. (8.103), but replace $m_0$ with $m_p$, the mass of a proton. In this case

$$(\Gamma - 1)\,m_p c^2 = 10^{14}\ \text{MeV}, \quad\text{or}\quad \Gamma - 1 \approx 10^{11}. \tag{8.105}$$
Take Γ = 10¹¹.
The denominator in (8.103) is
$$\text{denom} = \frac{4\Gamma E}{m_p c^2} + 1 = \frac{4 \times 10^{11} \times 10^{-3}}{10^{9}} + 1 = 1.4. \tag{8.106}$$

Hence,
$$E' = \frac{4\Gamma^2 E}{\text{denom}} = \frac{4 \times 10^{22} \times 10^{-3}\ \text{eV}}{1.4} \approx 2.9 \times 10^{19}\ \text{eV} = 2.9 \times 10^{13}\ \text{MeV}. \tag{8.107}$$

The extremely “weak” photon gains an enormous energy of ≈ 10¹³ MeV from the impact with the cosmic ray proton.
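
Problems 4 and 5 both reduce to evaluating the head-on back-scattering relation $E'/E = 4\Gamma^2 / (4\Gamma E/m_0c^2 + 1)$, which holds in the approximation Γ ≫ 1 made above. The following is a minimal sketch of that evaluation; the function name and argument order are illustrative assumptions, and all energies must be supplied in the same unit.

```python
def backscattered_energy(E_photon, Gamma, rest_energy):
    """Energy of a photon back-scattered head-on by an ultra-relativistic particle.

    E_photon    : initial photon energy
    Gamma       : dynamic Lorentz factor of the incoming particle (Gamma >> 1 assumed)
    rest_energy : rest energy m0*c^2 of the particle, same unit as E_photon
    """
    denom = 4.0 * Gamma * E_photon / rest_energy + 1.0
    return 4.0 * Gamma**2 * E_photon / denom


if __name__ == "__main__":
    # Problem 4: electron with Gamma = 2000, photon of comparable energy (MeV)
    print("electron case, E' (MeV):", backscattered_energy(1.0e3, 2000.0, 0.511))
    # Problem 5: CMB photon of 1e-3 eV hit by a proton with Gamma = 1e11 (eV)
    print("proton case,   E' (eV): ", backscattered_energy(1.0e-3, 1.0e11, 1.0e9))
```

When $4\Gamma E \gg m_0c^2$ the result saturates at $\Gamma m_0 c^2$, the full energy of the incoming particle; when $4\Gamma E \ll m_0c^2$ it reduces to $4\Gamma^2 E$.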


8.9.5. Example 5: Doppler effect 


Example 1. A fast moving space station is moving with speed βc along a straight line and is emitting light at a frequency $f_0$. An observer O is located at a distance h from the straight line shown in Fig. 8.6. We are required to calculate the frequency f of the same light as detected by the observer as a function of time.


The first thing to remember is that when an observer O receives light 
from a moving source S at time t, this light comes not from the present 
location C of the source, but from the retarded location B of the source, at 


the coordinate x, and emitted at the retarded time t′, as explained in Fig. 8.6.
These two times are related by the equation: 


Fig. 8.6 Source moving at impact parameter b. 


where is the distance between the source S and the observer O at the 
present time t. When we use the Doppler formula (3.31) 


the angle θ is the angle at which the source was located at the retarded time
t'. That is 


Our first task is to find a relationship between x and t. It is assumed that 
the source is moving with speed cβ along the X-axis. We set the origin of
the X-axis at A, vertically above O, and set time t = 0 when light emitted 
from A is received at O. We shall find the solution in several steps. 


Step 1: Find , i.e. the time when the source was at A. 


In formula (8.108) set t=0, =, =h. 


Step 2: Find a relation between the time ct’ and the displacement x. 


Referring to Fig. 8.6, S was at A when $ct' = ct'_A = -h$, at P when time ct′ = 0, and at B when time ct′ = ct′. S is moving with speed cβ. Referring to Fig. 8.6


Step 3: Find a relation between t and x. 


From (8.108a) and (8.112) 


which is a quadratic equation. To get the roots we simplify the above 
equation by setting h = 1, x = ct to the form: 


If we set x = 0, corresponding to t = 0, we get two solutions: 


The first solution x = 0 corresponds to light originating from the emitter at t = −h/c (retarded location) and coming vertically down to the observer at t = 0. The second solution corresponds to light starting from the observer at t = 0 and reaching the emitter at x = 2γ²β (advanced location) after time R/c, where R² = h² + (2γ²β)². We shall adopt the first solution (i.e. with the + sign).
Now we go back to the Doppler formula (8.109) and set 


Before we plot the frequency ratio as a function of time, we shall specialize Eq. (8.109) for two specific cases: θ = π/2, corresponding to the transverse Doppler effect, and θ = 0, corresponding to the longitudinal Doppler effect. We collect the formulas for these two cases from (3.32)


Before plotting the frequency ratios for three values of the emitter velocity, namely β = 0.8, 0.9, 0.95, we shall obtain numerical values of the above ratios so that we can check that the same values are obtained from the plots.
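
A short numerical sketch of these ratios follows. It assumes the Doppler formula takes the standard form $f/f_0 = 1/[\gamma(1 - \beta\cos\theta)]$, with θ the angle between the source velocity and the line from the retarded source position to the observer; whether this matches the exact notation of Eqs. (3.31) and (3.32) is an assumption. The second function uses the geometry described above (origin A vertically above O, t = 0 when light from A reaches O) to give the observation time and the frequency ratio for a given retarded position x. The function names and the default units h = c = 1 are illustrative choices.

```python
import math


def doppler_ratio(beta, theta):
    """f/f0 for a source of speed beta*c seen at retarded angle theta (radians)."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return 1.0 / (gamma * (1.0 - beta * math.cos(theta)))


def observed(beta, x, h=1.0, c=1.0):
    """Return (t, f/f0) for light emitted when the source was at position x.

    The source moves along the X-axis at distance h from the observer; it passes
    the origin A (vertically above O) at the retarded time t' = -h/c.
    """
    R = math.hypot(x, h)                 # distance from retarded position to O
    t_emit = x / (beta * c) - h / c      # retarded time t'
    t_obs = t_emit + R / c               # time of reception at O
    cos_theta = -x / R                   # source approaches O while x < 0
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return t_obs, 1.0 / (gamma * (1.0 - beta * cos_theta))


if __name__ == "__main__":
    for beta in (0.8, 0.9, 0.95):
        print(beta,
              "transverse:", doppler_ratio(beta, math.pi / 2),
              "longitudinal (approaching):", doppler_ratio(beta, 0.0))
```

Sweeping x from large negative to large positive values in `observed` traces out the frequency ratio as a function of the observation time, passing through the transverse value at t = 0.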


In Fig. 8.7, we have plotted the frequency ratios as a function of the 
observation time. The above values are clearly reflected in the plots. 


We shall add a numerical “feel” of the results obtained, assuming the light is emitted by a sodium vapour lamp of which the wavelength is 5890 Å. The corresponding frequency is $f_0 = 5.09 \times 10^{14}$ Hz. The observed
frequencies will then be The observed light falls in the Near Infra Red 
region, as shown in Fig. 8.7. 


Fig. 8.7 Frequency ratio as a function of observation time. 


8.10. Exercises for the Reader II 


R1 Consider elastic collision between two billiard balls A and B as in Fig. 
8.3. The velocity of the incoming ball A is Let the scattering angle (i.e. of 
the ball A after the collision) be θ = 30°. Find the following quantities:


(a) The dynamic Lorentz factor of the incoming ball A. 

(b) The boost velocity cβ (of the ZM frame with respect to the Lab frame), and the boost Lorentz factor γ.

(c) The bouncing angle φ of the ball B after the collision.

(d) The 3-momenta $\mathbf{p}_{A\text{-in}}$, $\mathbf{p}_{A\text{-out}}$, $\mathbf{p}_{B\text{-in}}$, $\mathbf{p}_{B\text{-out}}$ of the balls A and B before and after the collision.

(e) The total energies of the balls A and B before and after the collision.

(f) The En-Menta of the system of two balls before and after the collision.

(g) The rest mass $M_0$ of the system of two balls in the Lab frame S.


Ans. (a) 
(d) 


(e) 


(f) 


(g) $M_0 = 2.66\,m_0$.


R2 Consider the same scattering as in the previous exercise. Assume symmetric scattering of the two balls after collision, so that φ = θ. Let the dynamic Lorentz factor of the incoming particle be Γ. Show that the balls will bounce out with an angle Θ between them given by the formula:


where K represents the kinetic energy of the incoming particle A. 


R3 A high-pressure sodium lamp contains vapourized sodium at a 
temperature of 2700 K. The sodium molecules are in random thermal 
motion with an average kinetic energy of (3/2)kT, where k = 1.38 × 10⁻²³ J/K = Boltzmann constant. A young scientist analyzes the spectral lines of the light emitted by the gas. Take the mass of a sodium molecule as m = 46u. The wavelength of the sodium line is 5890 Å:


(a) Find the average velocity of a sodium molecule. 
(b) Find the broadening of a typical spectral line. 


Note that you will get the same answer if you apply the non-relativistic 
formula. 


Ans. (a) 5.7 × 10⁻⁶ c. (b) 0.065 Å.


R4 Consider the decay of a Λ-particle (cited earlier in the first problem in Sec. 4.11). Initially moving with a relativistic speed, it decays into particle #1 (proton) and #2 (pion), leaving an angle θ between their tracks, as seen
in a bubble chamber. Let represent the En-Menta of Λ, p and π,
respectively. Using conservation of En-Mentum and invariance of its norm, 
show that 


where p; = |p'|, i= 1, 2. 


[Hint: Simplify both sides of the equation: 


ᵃ See [25, p. 186]. For the actual reaction, see [37, p. 689].
ᵇ See Ref. [26] and Problem 7-2 on p. 225 in Ref. [25].


Chapter 9 
Relativistic Rocket 


9.1. Introduction 


We shall present an example of how Minkowski’s equation of motion works 
by demonstrating its application on a relativistic rocket. A relativistic 
rocket, in principle and for all theoretical calculations, is the same familiar 
rocket that the students have studied in their mechanics books,* with the difference that the exhaust gas is ejected with a “relativistic speed” u and, as a consequence, the rocket accelerates to a relativistic speed in due time. What we call relativistic speed is roughly the range c/3 < u ≤ c, where c is the speed of light. Because of the relativistic velocities involved in this case, Newtonian mechanics breaks down, and we have to use Minkowskian mechanics, in particular the Minkowskian equation of motion (EoM).

A relativistic rocket, i.e. a space-ship propelled by ejected gas to 
relativistic speed is not a reality [28]. However, one can still think of 
matter—antimatter annihilation rockets, and pion rockets for intellectual 
entertainment [29]. Even then the purpose behind our spending time on 
such an object is somewhat pedagogical. The exercises we are going to 
undertake are intended to sharpen one's understanding of the Minkowskian
equation of motion, employing 4-vectors. 

We have derived the mass equation for a relativistic rocket (see Eq. 
(9.8)) using momentum—energy conservation principles [30-33]. 

Some features of this chapter that may kindle a special interest in a 
student or a teacher of special relativity are as follows: 


• We have subjected the two important equations derived in this chapter, namely, (a) the mass velocity equation (9.8), and (b) the EoM (9.29), to the N.R. test, by which we mean that all relativistic equations that have a non-relativistic (N.R.) analogue must converge to their N.R. counterparts when v ≪ c.

• Taking u as the ejection velocity of the emitted gas/radiation, we have obtained two special solutions of the EoM, corresponding to (i) u = c/3 and (ii) u = c. We have plotted the velocity–time relation for both the cases, and shown that the plot for the case (i) closely follows the plot for the corresponding formula for v = v(t) obtained using N.R. (Newtonian) mechanics, up to v ≈ 0.5c.

• We have adopted a four-dimensional Minkowskian approach to obtain the EoM of the rocket, using 4-vectors, e.g. 4-velocity, 4-momentum, 4-acceleration, 4-force. For this purpose, we have adopted a mathematical formalism outlined by Møller [34].


Since the motion of the rocket will be one dimensional, confined to the 
x-direction, a typical 4-vector will have only t- and x-components and will be written as $\vec{A} = (A^t, A^x)$.


9.2. The Rocket, Its Specifications 


Let us now take a look at the rocket of our discussion. We have illustrated it 
in Fig. 9.1. It is moving along the X-axis with velocity v(t) m/s with respect to an inertial frame S, which, for fixing the idea, we shall call the ground frame (GF). It is ejecting gas at a constant velocity −u m/s and its rest mass at a constant rate $r = d\mu_0/d\tau$ kg/s, relative to its instantaneous rest frame (IRF) $S_0(\Theta)$, thereby generating a reaction force (in this case a thrust force). Our purpose is to find a formula for this force, and then the equation of motion (EoM).

Note that we have labelled the IRF with the extra tag (Θ) to stress that it coincides with the rocket frame R at the event “Θ”, which, for fixing the idea, can be taken as “Θ: rocket passes a space station A”. Every IRF has to be associated with one, and only one, event “Θ”.

Three quantities are specified for assessing the performance of the rocket: u, r and $M_i$ = initial rest mass of the rocket at the instant t = 0, when it starts with zero velocity. In this chapter, M = M(Θ) will stand for the instantaneous rest mass of the rocket at the event Θ. When written as a function of the “ground time” t, it will appear as M(t).


Fig.9.1. The rocket. 


Let us consider two infinitely close events $\Theta_A$ and $\Theta_B$ (corresponding to the rocket passing two infinitely close space stations A and B on its path), the time-space coordinate differentials between them being (cδt, δx) with respect to S, and (cδτ, 0) with respect to $S_0$. Between these events the rocket ejects a quantity of gas of rest mass $\delta\mu_0$. Consequently, its own velocity changes (i) from v(t) to v(t) + δv with respect to the GF S, (ii) from 0 to δv′ with respect to $S_0$, and (iii) its rest mass changes from M(t) to M(t) + δM. Note that, the time differential between the events being infinitesimally small, δτ is the proper time between the events. Also note that the rate of emission of the rest gas mass with respect to the rocket frame is $r = \delta\mu_0/\delta\tau$, which is taken as a constant.

In summary, the performance of the rocket is decided by three specifications: (i) its initial mass $M_i$, (ii) the rate r at which rest mass is ejected from the rear end with respect to the IRF, (iii) the velocity −u, with respect to the IRF, with which this rest mass is ejected. These quantities, being in the specification book supplied by the manufacturer, are frame independent and are to be taken as constants.


9.3. Review of the Non-relativistic Results 


We shall briefly review the non-relativistic (N.R.) rocket formulas so that
we can compare the relativistic results with their N.R. counterparts. We
shall drop the subscript "o" from δμ_o, because in the N.R. zone there is no
such thing as proper mass. The N.R. formulas can be found in standard
books on mechanics. We shall quote the following formulas [8]:


In (9.1b), M(v) is the same as M(t), since v = v(t). Equation (9.1d), in which
dμ is the mass of the gas ejected in time dt, is a restatement of conservation
of mass. The relationship between the velocity differential and mass
differential shown in Eq. (9.1a) is a consequence of (i) conservation of
mass, and (ii) conservation of linear momentum. The mass ratio equation
(9.1b) is obtained by integrating the differentials in (9.1a). Equation (9.1e)
represents the EoM of the rocket. Equation (9.1f) gives the solution of the
EoM, subject to the initial condition: v = 0 when t = 0.
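
The N.R. relations described above can be checked numerically. The following is a minimal sketch that assumes the standard textbook forms M(t) = M_i − rt, M(v) = M_i exp(−v/u) and v(t) = −u ln(1 − rt/M_i); these may differ in notation from Eqs. (9.1), and the function names and numerical values are illustrative only.

```python
import math

def nr_velocity(t, u, r, m_i):
    """N.R. rocket speed at time t, starting from rest (assumed standard form)."""
    return -u * math.log(1.0 - r * t / m_i)

def nr_mass(t, r, m_i):
    """Rocket mass at time t when mass is ejected at the constant rate r."""
    return m_i - r * t

u, r, m_i = 1.0e3, 1.0, 1.0   # illustrative values: m/s, kg/s, kg
t, h = 0.5, 1.0e-6

# Mass-ratio form: M = M_i * exp(-v/u) should reproduce M_i - r*t.
v = nr_velocity(t, u, r, m_i)
mass_from_ratio = m_i * math.exp(-v / u)

# EoM check: M dv/dt should equal the thrust T = r*u.
dv_dt = (nr_velocity(t + h, u, r, m_i) - nr_velocity(t - h, u, r, m_i)) / (2.0 * h)
thrust_lhs = nr_mass(t, r, m_i) * dv_dt

print(mass_from_ratio, nr_mass(t, r, m_i))
print(thrust_lhs, r * u)
```

The two numbers on each printed line should agree, the second pair only up to the error of the finite-difference derivative.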


9.4. Relativistic Mass Equation 


We now have the following Lorentz factors, corresponding to the velocities 
to be used: 


As stated above, the rocket velocity changes from 0 to δv′, and it ejects a
quantity of gas of rest mass δμ_o, from the event Θ_A to the event Θ_B in the
IRF S_o. We shall write the components of the 4-momentum of the rocket at
Θ_A and Θ_B, and of the gas ejected between these events, all of them in S_o.

At this point, we emphasize once again that M = M(Θ) = M(t) = M(τ) is
the instantaneous rest mass of the rocket at the event Θ and hence is a
4-scalar.

The (t, x)-components of the momentum 4-vectors we shall write below
follow from (8.17), in which we shall set m_o = M(t). Also, note that the
dynamic Lorentz factors are: g for the ejected gas, and Γ′ = 1 for the rocket,
since v′ = 0, i.e. the rocket is momentarily at rest in S_o. The 4-vectors
written below have only (t, x)-components, and are valid in the IRF S_o:


In the above, Eq. (9.3c) gives the change in the 4-momentum of the rocket, and
Eq. (9.3d) the 4-momentum of the ejected gas, between the events Θ_A and Θ_B. We
shall apply the conservation of 4-momentum in S_o, using the data in Eqs.
(9.3c) and (9.3d).


Note that (i) the rest mass lost by the rocket equals the relativistic mass
gained by the ejected gas, according to (9.4b), and (ii) Eq. (9.4d) is valid in S_o.
To express it in S, we have to apply the velocity addition formula (4.6):


Hence (in the limit δv′ → 0),


This transforms Eq. (9.4d) to 


Integrating (9.7) from t = 0 to t = t, setting M = M_i (i for "initial") and v
= 0 at t = 0, we get


One of the requirements of all relativistic formulas is that they must
converge to the corresponding N.R. counterparts (if such counterparts exist)
in the N.R. limit v/c → 0. In this case, the N.R. mass formula is (9.1b). We
shall show that this requirement is satisfied by the formula (9.8), using the
definition of the Euler number:


$$ e \overset{\text{def}}{=} \lim_{x \to 0} (1 + x)^{1/x}. \tag{9.9} $$


Proof. We set β = v/c. Then in the limit β → 0


same as the N.R. formula (9.1b) 


(QED) 
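
The convergence claimed in this proof can also be seen numerically. The sketch below assumes that the mass equation (9.8) has the form M/M_i = ((1 − β)/(1 + β))^{c/2u}, which is the form quoted later as Eq. (9.30) with n = c/u; the function names are illustrative. The ratio v/u is held fixed at 1 while β = v/c is pushed towards zero.

```python
import math

def mass_ratio_rel(beta, n):
    """Relativistic M/M_i = ((1 - beta)/(1 + beta))**(n/2), n = c/u (assumed form)."""
    return ((1.0 - beta) / (1.0 + beta)) ** (n / 2.0)

def mass_ratio_nr(beta, n):
    """N.R. mass ratio exp(-v/u), written with v/u = n*beta."""
    return math.exp(-n * beta)

# Keep v/u = n*beta = 1 fixed while beta = v/c shrinks.
for beta in (1e-1, 1e-2, 1e-3):
    n = 1.0 / beta
    print(beta, mass_ratio_rel(beta, n), mass_ratio_nr(beta, n))
```

The gap between the two columns shrinks roughly as β², as expected from expanding the logarithm of the relativistic ratio.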


Is the formula (9.8) valid when u = c? To make sure, we shall retrace the
steps from Eq. (9.3) downward, specializing them to u = c. Instead of
assuming that the gas is ejected with velocity −u with respect to the IRF S_o, we
shall assume that, between the events Θ_A and Θ_B, a beam of photons is
emitted in the −x-direction with energy δE_o with respect to the IRF S_o. In
this case, we use Eq. (8.26) for photon momentum:


At Θ_A: $P = M\,(c,\ 0).$ (9.11a)

At Θ_B: $P + \delta P = (M + \delta M)\,(c,\ \delta v').$ (9.11b)

$\delta P = (\delta M\,c,\ M\,\delta v').$ (9.11c)

$\delta P_{\rm ph} = \left(\dfrac{\delta E_o}{c},\ -\dfrac{\delta E_o}{c}\right).$ (9.11d)


In the above, δP_ph stands for the 4-momentum of the emitted photons.
We shall apply the conservation of 4-momentum in S_o, using the data in
Eqs. (9.11c) and (9.11d).


It follows that Eq. (9.4d) will be valid for u = c. As a consequence (9.8) is 
also valid for u = c. We shall rewrite this for this special case: 


9.5. The Thrust 4-Force 


The 4-momentum of the rocket at the event Θ_A can be written as P(Θ_A) =
M(Θ_A)V(Θ_A). We have used Eq. (8.17), with m_o replaced by M(Θ_A). The time
difference between the events Θ_A and Θ_B is δt with respect to S, and δτ with
respect to the IRF S_o. Differentiating with respect to τ we get


- V. (9.14) 


where r = dμ_o/dτ. (9.15)


To get a parallel formula for the photon-driven rocket, we refer to
(9.12b), and get

$$ \frac{dM}{d\tau} = -\frac{1}{c^2}\,\frac{dE_o}{d\tau}. \tag{9.16} $$

We can combine the two equations into one, assuming that the rocket is
ejecting relativistic mass, either in the form of matter or in the form of
radiation (we shall use the term radiation to mean photons), at the constant
rate of ϱ kg/s in its rest frame.


Note that r is constant by assumption, and g is constant because u is so.
Hence ϱ is constant in (9.17a). We now assume that if photons are ejected to
generate the reaction force, then dE_o/dτ is also constant, so that ϱ is
constant in (9.17b) as well. Then by (9.15)
— (9.17) 


for both matter and radiation. 
We now go back to (9.14), and rewrite it as follows: 


is the “reaction 4-force”, or better still the thrust 4-force. However, we are 
using the symbol R instead of T, because the latter symbol can be confused 
with time. 

In the following equations, we write the (t, x)-components of the
4-vectors in S_o:


In other words, the reaction 4-vector has the following components with
respect to S_o:


Note that the (t, x)-components of the reaction 4-force in S_o are in
agreement with (8.36).


We shall now find the (t, x)-components of the reaction 4-force R in the
ground frame S, applying the Lorentz transformation, Eq. (7.77a),
corresponding to the boost S_o(−cβ, 0, 0)S, to the (t, x)-components of R in
the IRF S_o, shown in (9.21).


Note that the (t, x)-components of the reaction 4-force in S are in 
agreement with (8.36), which we rewrite in the present context as 


Referring back to Eq. (9.17):


• For radiation emission, R = ϱc.

• For matter emission, R = ϱu = gru = gT, where T = ru is the same thrust
force as in non-relativistic mechanics; see Eq. (9.1c). It changes to R = gT
as it enters the relativistic domain.


9.6. The Equation of Motion 


We return to Eq. (9.19a) and write the equation of motion

$$ M\,\frac{dV}{d\tau} = R, \tag{9.24} $$

where M = M(Θ) is the instantaneous rest mass of the rocket at the event Θ,
and is a 4-scalar. All we now have to do is to write the x-component of the
4-vectors on either side of the equation, and simplify the same to obtain the
acceleration a = dv/dt of the rocket in the GF S. We shall, however, find it
convenient to work out the acceleration a_o = dv′/dτ in the IRF S_o, and convert
this acceleration to a using Eq. (8.46).

Consider the x-component of dV/dτ using (8.8). The kinematic quantities in S_o
will be identified with a "prime". Then,


$$ \frac{dV^x}{d\tau} = \frac{d(\Gamma' v')}{d\tau} = \Gamma'\,\frac{dv'}{d\tau} + v'\,\frac{d\Gamma'}{d\tau} = \frac{dv'}{d\tau} = a_o, \tag{9.25} $$

since v′ = instantaneous velocity of the rocket in S_o = 0. Consequently, Γ′ =
1.

From (9.21), the x-component of R is ϱu. We thus get a simple-looking equation
of motion, which is valid in S_o:

$$ M(\Theta)\,a_o = \varrho u = \text{constant}. \tag{9.26} $$


Mass × acceleration is constant. But mass is not constant. Hence, the
acceleration in the IRF S_o is not constant.
We shall write the EoM in the ground frame S by converting a_o to a,
the acceleration in the ground frame S, using (8.16), which gives a_o = Γ³a:

$$ M(\Theta)\,\Gamma^3 a = \varrho u = \text{constant}, \tag{9.27a} $$

$$ \text{or,}\quad M(\Theta)\,\Gamma^3\,\frac{dv}{dt} = \varrho u = \text{constant}. \tag{9.27b} $$


Now we rewrite Eq. (9.27b), using the mass equation (9.8): 


where M_i is the initial (rest) mass of the rocket.

We shall set β = v/c, and c/u = n in the index of the leftmost factor in
Eq. (9.28). Here n ≥ 1 is a real number; n = 1 corresponds to u = c. At the
other extreme, n → ∞ would converge to the
N.R. EoM shown in (9.1e). We now simplify the left side:

$$ \left(\frac{1-\beta}{1+\beta}\right)^{n/2}\Gamma^3 = \frac{(1-\beta)^{n/2}}{(1+\beta)^{n/2}}\,\big[(1+\beta)(1-\beta)\big]^{-3/2} = \frac{(1-\beta)^{(n-3)/2}}{(1+\beta)^{(n+3)/2}}. $$

The EoM (9.28) now takes a simpler form

$$ \left[\frac{(1-\beta)^{(n-3)/2}}{(1+\beta)^{(n+3)/2}}\right]\frac{d\beta}{dt} = \frac{\varrho u}{c M_i} = k = \text{constant}. \tag{9.29} $$

Let us rewrite the mass equation (9.8), setting v = cβ; c/u = n:

$$ \frac{M}{M_i} = \left(\frac{1-\beta}{1+\beta}\right)^{n/2}. \tag{9.30} $$

We assume that the rocket has no payload, so that all its mass will ultimately be
ejected to provide the thrust. In other words, the rocket operates until M
→ 0, which happens when β → 1.

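We shall now show that the above EoM (9.29) converges to the
non-relativistic EoM given in (9.1e). We shall set ϱ = gr as per (9.17a), let β → 0,
and use the definition of Euler's number e, as in (9.9).


Proof.

$$ \frac{(1-\beta)^{(n-3)/2}}{(1+\beta)^{(n+3)/2}} \;\xrightarrow{\ \beta \to 0\ }\; \frac{1}{(1+\beta)^{n/2}}\cdot\frac{1}{(1+\beta)^{n/2}} = \frac{1}{(1+\beta)^{n}} = \frac{1}{(1+\beta)^{c/u}} = \frac{1}{\big[(1+\beta)^{1/\beta}\big]^{v/u}} \to \frac{1}{e^{v/u}}. $$

Substituting this in (9.29) we get:

$$ M_i\,e^{-v/u}\,\frac{dv}{dt} = \varrho u = g r u \to r u, \quad \text{since } g \to 1. $$

Thus, we get back (9.1e).
(QED)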


9.7. Solution of the EoM for Two Special Cases 


We shall find the solution of the EoM for two special cases, namely (i) n = c/u
= 3, and (ii) n = c/u = 1. The first case corresponds to the transition zone
from the non-relativistic to the relativistic domain; the latter corresponds to a
photon rocket.


Example 1. Set n = 3, implying u = c/3. 


The reason for choosing n = 3 is two-fold: (1) the EoM shown in (9.29) will
assume the simplest form, the numerator within the square brackets
becoming 1; (2) we are now at the threshold of transition from the
non-relativistic (N.R.) to the relativistic domain, the Lorentz factor g of the
ejected gas being very close to 1, in fact g = 3/√8 ≈ 1.06. Our results obtained
here should be close to the N.R. results, so that we may feel comfortable that
we are on the right track.

We specialize the EoM (9.29) for this special case:

$$ \left[\frac{1}{(1+\beta)^3}\right]\frac{d\beta}{dt} = \frac{\varrho}{3M_i} = k_3 = \text{constant}. \tag{9.32} $$


Integration, subject to the initial condition β = 0 when t = 0, leads to the
following solution:


The reader can verify the answer by differentiating β with respect to t.
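
The verification suggested here can also be done numerically. The sketch below assumes the closed form obtained by integrating the n = 3 equation of motion, (1 + β)⁻³ dβ/dt = k₃, directly from β = 0 at t = 0, namely β(t) = (1 − 2k₃t)^(−1/2) − 1, which reaches β = 1 at t = 3/(8k₃). The function names are illustrative, and the parameter values M_i = 1 kg, r = 1 kg/s are those used for the plots.

```python
import math

def beta_n3(t, k3):
    """Closed-form solution of (1 + beta)**-3 * dbeta/dt = k3 with beta(0) = 0."""
    return 1.0 / math.sqrt(1.0 - 2.0 * k3 * t) - 1.0

def critical_time_n3(k3):
    """Time at which beta reaches 1 (burn-out of a rocket with no payload)."""
    return 3.0 / (8.0 * k3)

m_i, r = 1.0, 1.0                              # kg, kg/s
g = 1.0 / math.sqrt(1.0 - (1.0 / 3.0) ** 2)    # Lorentz factor of the exhaust, u = c/3
rho = g * r                                    # ejection rate of relativistic mass
k3 = rho / (3.0 * m_i)

# Verify the solution by numerical differentiation at an arbitrary time.
t, h = 0.5, 1.0e-6
dbeta_dt = (beta_n3(t + h, k3) - beta_n3(t - h, k3)) / (2.0 * h)
lhs = dbeta_dt / (1.0 + beta_n3(t, k3)) ** 3

t_c = critical_time_n3(k3)
print(lhs, k3)
print(t_c, beta_n3(t_c, k3))
```

The first line should print two nearly equal numbers; the second shows the burn-out time and confirms that β equals 1 there.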

We have written t_c to mean "critical time", when M → 0, as explained
below Eq. (9.30). In other words, t_c is the time at "burn out", assuming that
the rocket has no payload, all its mass having ultimately been ejected to
provide the thrust. This happens when β → 1. From Eq. (9.33):


We have plotted the velocity-time relation (rather, the β-t relation) in
Fig. 9.2(a), using Gnuplot. On the same graph, we have also plotted the
N.R. equation (9.1f).

We have set M_i = 1 kg, r = 1 kg/s. Setting u/c = 1/3 in the first of the
equations in (9.2), we get g ≈ 1.06. Hence ϱ (defined in Eq. (9.17a)) = gr =
1.06 kg/s. Therefore k_3 = ϱ/(3M_i) ≈ 0.35 s⁻¹. The critical time given by
Eq. (9.34) has been set as the upper limit on the t axis.

The two plots, relativistic and non-relativistic, almost coincide up to t ≈
0.8 s, β ≈ 0.5.


Fig.9.2. Case I, n = 3. Plots for (a) velocity vs. time; (b) acceleration vs. velocity. 


In Fig. 9.2(b), we have plotted dβ/dt against β. However, in this case we set the
vertical axis to represent the independent variable β, matching it with the
vertical β-axis of Fig. 9.2(a). The horizontal axis, pointing to the left,
represents the dependent variable dβ/dt. We achieved this configuration by
first plotting the usual way, then turning the plot anticlockwise by 90°. Our
objective here has been to check whether the slope of the β-t curve in Fig. 9.2(a) is
corroborated by the measure of dβ/dt in Fig. 9.2(b). In order to judge the
correspondence, we marked four selected points on the curve (a) and their
corresponding points on (b), and wrote the values of dβ/dt for these points on the
upper horizontal axis of the plot box. Fair correspondence between these
values in Fig. 9.2(b) and the corresponding slopes in Fig. 9.2(a) is discernible,
suggesting that Eq. (9.33) is the solution of the EoM (9.32).


Example 2. Set n= 1, implying u = c. 


This is the photon rocket mentioned in the Introduction. In this case, a jet of 
photons flowing out from the tail end of the rocket is serving as the 
propellant. We specialize the EoM (9.29) for this special case: 


Integrating from t = 0, when β = 0, to t = t, when β = β,


we get 


or 


We have plotted the velocity-time relation (rather, the β-t relation) in Fig.
9.3(a). However, in Eq. (9.37) t is a function of β. Hence, using Gnuplot we
first obtained a plot with β on the horizontal axis and t on the vertical axis. In
order to reverse their roles, we transformed the plot by (i) a rotation through 90° in
the anticlockwise direction, followed by (ii) a reflection about the vertical
axis (i.e. about the new β-axis).
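
An alternative to rotating and reflecting the plot is to invert t(β) numerically. The sketch below assumes the n = 1 equation of motion in the form dβ/dt = k₁(1 − β)(1 + β)², with k₁ = ϱ/M_i, and the closed form obtained by integrating it by partial fractions, k₁t = ¼ ln[(1 + β)/(1 − β)] + β/[2(1 + β)]. The function names and the bisection tolerance are illustrative choices.

```python
import math

def t_of_beta(beta, k1):
    """Time needed to reach beta, from integrating dbeta/((1-beta)(1+beta)**2) = k1 dt."""
    return (0.25 * math.log((1.0 + beta) / (1.0 - beta))
            + beta / (2.0 * (1.0 + beta))) / k1

def beta_of_t(t, k1, tol=1e-12):
    """Invert t_of_beta by bisection; t_of_beta increases monotonically on 0 <= beta < 1."""
    lo, hi = 0.0, 1.0 - 1e-15
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if t_of_beta(mid, k1) < t:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

m_i, rho = 1.0, 1.0      # kg, kg/s
k1 = rho / m_i           # value of the constant k when u = c

for t in (0.5, 1.0, 2.0, 4.0):
    print(t, beta_of_t(t, k1))
```

The printed values of β increase towards 1 without reaching it, in line with the logarithmic divergence of t as β → 1.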


Fig.9.3 Case II, n = 1. Plots for (a) velocity vs. time; (b) acceleration vs. velocity. 


In Fig. 9.3(b), we have plotted dβ/dt (its axis pointing left) vs. β (its axis
pointing upward). The procedure, objective, and explanations are the same
as in Example 1.

It should be noted that in this case ϱ is given by (9.17b), which we
rewrite and interpret as follows:


For the plots we have taken M_i = 1 kg, ϱ = 1 kg/s.

How long does the rocket operate? Until β → 1, as mentioned below
Eq. (9.30), and therefore, by (9.37), until t → ∞.

It is seen from the plot in Fig. 9.3(a) that β approaches unity (or, v
approaches c) asymptotically.


a See [7, pp. 327-330]; [8, pp. 84-87, 144-147, 315-319, 328-332].


Chapter 10 


Magnetism as a Relativistic Effect 


10.1. Velocity-Dependent Force from a Velocity-Independent 
One under a Lorentz Transformation 


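We shall go back to the Lorentz transformation of 4-force outlined in Sec. 8.7.4
and reconsider how a force F′ on a particle at a particular event in S′ will
transform into a force F in S at the same event under the boost S(β, 0, 0)S′,
for which the boost velocity cβ is along the x-axis. We specialize the force
transformation formula (8.56) to this boost:

$$ F_x = \frac{F'_x + \dfrac{\beta}{c}\,(\mathbf{F}'\cdot\mathbf{v}')}{1 + \beta v'_x/c}, \qquad F_y = \frac{F'_y}{\gamma\,(1 + \beta v'_x/c)}, \qquad F_z = \frac{F'_z}{\gamma\,(1 + \beta v'_x/c)}. $$

Note that

$$ F'_x + \frac{\beta}{c}\,(\mathbf{F}'\cdot\mathbf{v}') = F'_x\left(1 + \frac{\beta v'_x}{c}\right) + \frac{\beta}{c}\,(F'_y v'_y + F'_z v'_z), $$

$$ \frac{F'_{y,z}}{\gamma} = \gamma(1-\beta^2)\,F'_{y,z} = \gamma\left[\left(1 + \frac{\beta v'_x}{c}\right) - \frac{\beta}{c}\,(v'_x + c\beta)\right]F'_{y,z}. \tag{10.2} $$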


Therefore, 


Hence it follows from Eq. (10.4) and the velocity addition formula 
(4.23b) that 


It will be a simple exercise to show that 


We can now write the force whose components are given in Eq. (10.4) 
as 


We also break up the force F′ into two components, viz., a component
F′_∥ which is parallel to the boost velocity cβ and a component F′_⊥ which is
perpendicular to cβ, so that F′ = F′_∥ + F′_⊥, and so that for the special
boost S(β, 0, 0)S′,


The general form of the force transformation formula, valid for a
general boost S(β)S′, now follows from Eq. (10.6):


It is then seen from Eq. (10.8) that the transformed force F has a
component F_o which is velocity independent if F′ is so, and a part v × G
which is explicitly velocity dependent provided G is not zero. We are
therefore led to the following theorem:


Theorem 10.1. Suppose a particle P moving in an arbitrary trajectory
experiences a purely velocity-independent force F′ as measured in an
inertial frame of reference S′. Suppose this frame S′ is moving with velocity
cβ with respect to another inertial frame S, and that the velocity of the
particle at some instant t is v in S. Then the force F, as measured in S at the
time t, will be velocity dependent, being the sum of a velocity-independent
component F_o and a purely velocity-dependent component v × G as given
by Eq. (10.8).


10.2. How Magnetic Force Originates from Lorentz 
Transformation 


The starting point of the principles of electromagnetism consists of two parts,
namely (1) the Lorentz force equation, which defines the electric field E and
the magnetic field B in terms of the force F that a charged particle q will
experience under the influence of a distribution of electric charges and
currents, and (2) Maxwell's equations. Using these two sets of equations,
one can understand and explain every phenomenon in the domain of
electromagnetism, including attraction and repulsion between electric
charges, electric currents, magnets, operations of motors and generators, as
well as the propagation of electromagnetic waves.

Let us now consider a distribution of charge which is static in an
inertial frame S′. Only an electrostatic field E′ exists in this frame, so that
the force experienced by a moving test particle of charge q is the
velocity-independent force F′ = qE′. If the distribution is in bulk motion with
constant velocity u = cβ as seen from another frame S, then S and S′ must be
related to each other by the boost S(β)S′. It now follows from the above
theorem, in particular Eq. (10.8), that the same particle will be seen to
experience a velocity-dependent force F in S, which is given by the formula:


since Line (a) in the above equation is the Lorentz force equation. Lines (b) 
and (c) show how a pure electric field E’ in the frame S' transforms into a 
combination of electric field E and magnetic field B in the frame S. Hence 
the main conclusion of this chapter. 


Conclusion: A distribution of charges, when moving with uniform velocity
u = cβ, will create, in addition to an electric field E, a magnetic field
B. The emergence of the resulting magnetic force can be linked to the
Lorentz transformation of contravariant 4-vectors, which itself is a
consequence of the postulates of special relativity. In this sense magnetism
is a relativistic effect.
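
The conclusion can be illustrated numerically. The sketch below transforms a purely electric force F′ = qE′ from S′ to S with the standard force transformation for a boost along x, and compares the result with the Lorentz force q(E + v × B) evaluated in S. It assumes the standard field relations for a boosted electrostatic field, E_∥ = E′_∥, E_⊥ = γE′_⊥ and B = (1/c) β × E, as the content of the transformation discussed here; all function names and numerical values are illustrative.

```python
import math

C = 299_792_458.0  # speed of light, m/s

def transform_force(f_prime, v_prime, beta):
    """Force in S from the force f_prime in S', for the boost S(beta, 0, 0)S'."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    d = 1.0 + beta * v_prime[0] / C
    f_dot_v = sum(f * v for f, v in zip(f_prime, v_prime))
    return ((f_prime[0] + beta * f_dot_v / C) / d,
            f_prime[1] / (gamma * d),
            f_prime[2] / (gamma * d))

def transform_velocity(v_prime, beta):
    """Particle velocity in S from its velocity v_prime in S' (velocity addition)."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    d = 1.0 + beta * v_prime[0] / C
    return ((v_prime[0] + beta * C) / d,
            v_prime[1] / (gamma * d),
            v_prime[2] / (gamma * d))

def lorentz_force(q, e_prime, v, beta):
    """q(E + v x B) in S, with E and B built from the pure electric field in S'."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    e = (e_prime[0], gamma * e_prime[1], gamma * e_prime[2])
    # B = (1/c) beta x E for a boost along the x-axis (assumed standard relation)
    b = (0.0, -beta * e[2] / C, beta * e[1] / C)
    cross = (v[1] * b[2] - v[2] * b[1],
             v[2] * b[0] - v[0] * b[2],
             v[0] * b[1] - v[1] * b[0])
    return tuple(q * (ei + ci) for ei, ci in zip(e, cross))

q = 1.6e-19                                  # test charge, C
e_prime = (2.0e3, -1.0e3, 5.0e2)             # static field in S', V/m
v_prime = (0.3 * C, -0.2 * C, 0.4 * C)       # particle velocity in S'
beta = 0.6

f_prime = tuple(q * e for e in e_prime)      # velocity-independent force in S'
f_direct = transform_force(f_prime, v_prime, beta)
f_lorentz = lorentz_force(q, e_prime, transform_velocity(v_prime, beta), beta)
print(f_direct)
print(f_lorentz)
```

The two printed triples should agree to rounding error: the velocity-dependent part of the transformed force is exactly what the magnetic term qv × B supplies.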


Note that the above exercise does not shed any light on what happens
when the charge distribution moves with an arbitrary (non-uniform)
velocity. The field resulting from such a motion can be worked out using the
full set of Maxwell's equations.

We shall come back to Eq. (10.9) in Sec. 11.4 through Eq. (11.37), and
in Sec. 11.6 through Eq. (11.63b), by a different route, namely, Lorentz
transformation of the electromagnetic field tensor.

It may be appropriate to close this discussion with a quotation from
Leigh Page:a "The rotating armature of every generator and every motor in
this age of electricity are steadily proclaiming the truth of the relativity
theory to all those who have ears to hear."


a L. Page, Lecture at the December 17, 1941 meeting of the American Institute of Electrical Engineers,
New York. Quoted at the beginning of Chapter 3 of Ref. [5].


Chapter 11 


Principle of Covariance with 
Application in Classical 
Electrodynamics 


11.1. The Principle 


Physics is geometry. Consider Newton's second law of motion:

$$ \mathbf{F} = \frac{d\mathbf{p}}{dt}. \tag{11.1} $$

The left-hand side is a "prescribed vector", implying thereby a straight
line segment of pre-specified "length" and "direction". The right-hand side
is a "constructed" vector, obtained through operations like "parallel
transport", drawing a "vector triangle" and division of one of its sides by a
3-scalar dt, as illustrated in Fig. 11.1.

A particle, while in motion, traces out a certain path Γ. Two nearby
locations A and B on this path are reached by the particle at times t and t +
δt. Its 3-momenta at these two instants are p(t) and p(t + δt), respectively.
The average force F between these two instants, multiplied by the time δt, is
(approximately) equal to δp, as illustrated in the upper box.

The ratio of the change in the momentum vector δp to the time
difference δt is the average force F acting on the particle between A and B.
This average force becomes the instantaneous force F at the instant t if we
make the time difference infinitesimally small, i.e. when δt → dt.

The process employed in this example epitomizes the structure of most
physical laws. The Lord of the universe conceived physical quantities of
nature as "geometrical objects" and ordained them to shape through rules of
geometrical constructions into different, but all the same, geometrical
entities.

Fig. 11.1. Graphical construction of Newton's second law of motion.


Even though the geometrical picture may not be overtly manifest in the
equations of physics, geometrical concepts and constructions are subtly at
work behind the structure of physical laws. This is because scalars, vectors
and tensors are all geometrical objects, as we highlighted at the end of Sec.
5.1.1, and illustrated in Fig. 5.1. The shape of every physical equation,
whether it establishes a law or proclaims a certain rule, convention,
definition or relation, must be such that the geometrical objects on either
side of the equation have come down to the same rank through processes of
geometrical construction. This, in a nutshell, is the Principle of Covariance.

Apparently then this principle is as old as Newtonian Mechanics.
However, the advent of relativity adds an extra dimension to this principle by
reminding us that the physical quantities are no longer geometrical objects
of the visible three-dimensional world, but are inhabitants of the
four-dimensional space-time.

The Principle of Covariance declares that the mathematical expressions 
of all physical laws must be written in such a way that either side of the 
equation is a 4-tensor of the same rank and same sequence of contravariant 
and covariant indices. When an equation satisfies this demand, we say that 
it is covariant. By corollary, a law or a relationship which cannot be written 
covariantly, must have a limited application, spatially or temporally, and 
cannot be regarded as a universal law. 


The law of motion written in the forms (8.28) and (8.33) provides two
examples of covariant equations.

In this chapter, we shall recast the familiar equations of classical
electrodynamics (the equation of continuity, Maxwell's equations, and the
electromagnetic energy-momentum conservation laws) in covariant
form.


11.2. The Flux of a Vector Field in E³


11.2.1. 2D surface embedded in 3D space, and the outward 
normal 


Figure 11.2(a) shows a two-dimensional surface S in the three-dimensional
Euclidean space. Such a surface is usually represented by a
mathematical equation in one of the following forms:


Implicit form: $\Psi(x, y, z) = C$, C = constant.
Parametric form: $x = f(u, v),\ y = g(u, v),\ z = h(u, v)$. (11.2)


In the implicit form, it is not possible to tell from the equation which one is 
the independent variable, because all variables are treated as equal. In the 
parametric form, a point (x, y, z) on the surface S is determined by two 
parameters, suggesting that the surface is a two-dimensional object, 
embedded in the three-dimensional space spanned by the (X, Y, Z)-axes. 

The simplest example of both can be provided by the equation of the 
surface of a sphere of radius R, and having the centre at the origin, shown in 
Fig. 11.2(b). 



Fig. 11.2 (a) Surface and its normal; (b) a spherical surface shown with unit vectors in the spherical 
coordinate system. 


Implicit form: $\Psi = x^2 + y^2 + z^2 = R^2$.
Parametric form: $x = R\sin\theta\cos\phi,\ y = R\sin\theta\sin\phi,\ z = R\cos\theta$. (11.3)


In the second case, the spherical coordinates (θ, φ) serve as the parameters
(u, v).

Surface integration of a vector field involves the unit normal vector n
drawn at each point (x, y, z) on the surface S. The gradient vector ∇Ψ points
in the direction of the outward normal (i.e. in the direction in which Ψ is
increasing). Hence

$$ \mathbf{n}(\mathbf{r}_o) = \left(\frac{\nabla\Psi(\mathbf{r})}{|\nabla\Psi(\mathbf{r})|}\right)_{\mathbf{r}=\mathbf{r}_o} \tag{11.4} $$

is the outward normal at a point r = r_o on the surface. As an example, the
outward normal drawn at a point r_o on the spherical surface S of Eq. (11.3)
is given by the following expression:

$$ \mathbf{n}(\mathbf{r}_o) = \left(\frac{\mathbf{r}}{r}\right)_{\mathbf{r}=\mathbf{r}_o} = \mathbf{e}_r(\mathbf{r}_o), \tag{11.5} $$

where (e_r, e_θ, e_φ) are the three unit vectors on the surface, associated with
the spherical coordinate system, being in the directions of increasing r, θ, φ,
respectively (indicated by the curves C_r, C_θ, C_φ, respectively). Of these
three, e_r is identified with n, the normal vector, the other two being tangent
vectors to the surface.


11.2.2. Surface integral, flux of a vector field 


Consider a vector field F(r), "flowing" through an open surface S which is
"oriented" (i.e. its unit normal at each point is uniquely defined). This
surface S can be divided into a very large number N of small patches δa_1,
δa_2, ..., δa_N, the largest of them having an area δa; i.e. δa_i ≤ δa; 1 ≤ i ≤
N. Consider one such patch δa_i whose centre is located at the coordinates r_i.
Let the unit normal vector on this patch be n_i. The vector field at r_i is F(r_i).
Then the surface integral of the vector field F(r) over S, represented by the
symbol Φ_F, is defined as the limit of a sum, as follows:

$$ \Phi_F = \iint_S \mathbf{F}(\mathbf{r})\cdot d\mathbf{a} = \lim_{N\to\infty,\ \delta a\to 0}\ \sum_{i=1}^{N} \mathbf{F}(\mathbf{r}_i)\cdot\mathbf{n}_i\,\delta a_i, \tag{11.6} $$

and is called the flux of the vector field F(r) across the surface S.

We shall not elaborate on this further, but illustrate the surface
integration with a familiar example, taking the vector field F(r) to be the
electric field E(r) emanating from a point charge Q sitting at the origin. The
surface S is the upper hemisphere with the centre at the origin, i.e. the
surface of Eq. (11.3), expressed in the spherical coordinate system as
follows: r = R; 0 ≤ θ ≤ π/2; 0 ≤ φ ≤ 2π.

In this case F(r) = E(r) = [Q/(4πε_0 r²)] e_r, and da = r² sin θ dθ dφ e_r, so that
E · da = [Q/(4πε_0)] sin θ dθ dφ. Hence,

$$ \Phi_E = \frac{Q}{4\pi\varepsilon_0}\int_{\theta=0}^{\pi/2}\int_{\phi=0}^{2\pi}\sin\theta\,d\theta\,d\phi = \frac{Q}{2\varepsilon_0} \tag{11.7} $$

is the flux of the E(r) field across the hemisphere. 
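
The limit-of-a-sum definition of the flux can be tried out directly on this example. The following minimal sketch divides the upper hemisphere into patches on a (θ, φ) grid and sums F · n δa over them for the field of a point charge; the result should approach Q/(2ε₀). The function name, the grid sizes and the numerical values of Q and R are illustrative assumptions.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m

def hemisphere_flux(q, radius, n_theta=400, n_phi=400):
    """Riemann-sum flux of a point charge's field through the upper hemisphere."""
    d_theta = (math.pi / 2.0) / n_theta
    d_phi = (2.0 * math.pi) / n_phi
    e_r = q / (4.0 * math.pi * EPS0 * radius ** 2)   # radial field on the surface
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta                   # patch centre
        patch_area = radius ** 2 * math.sin(theta) * d_theta * d_phi
        # E is parallel to the normal e_r, so F . n = e_r on every patch;
        # the n_phi patches in this ring all contribute equally.
        total += n_phi * e_r * patch_area
    return total

q, radius = 1.0e-9, 0.25     # illustrative values: C, m
print(hemisphere_flux(q, radius), q / (2.0 * EPS0))
```

The sum is independent of the radius, as it should be, since the r² in the patch area cancels the 1/r² of the field.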
The flux of a vector field is closely associated with Gauss’s divergence 
theorem, which we state here as follows: 


Theorem 11.1. Let F(r) be a vector field which is continuous, along with
continuous derivatives, in a region of space R. Let S be a closed piecewise
smooth surface in R, forming the boundary of the volume V within. Let n(r)
be a unit outwardly directed normal vector on S at a point r on S. Then

$$ \iiint_V \nabla\cdot\mathbf{F}(\mathbf{r})\,d^3r = \iint_S \mathbf{F}(\mathbf{r})\cdot\mathbf{n}\,da = \iint_S \mathbf{F}(\mathbf{r})\cdot d\mathbf{a}. \tag{11.8} $$
V S S 


11.2.3. Continuity equation 


Let us consider a fluid in streamline motion® as in Fig. 11.3. This fluid is
characterized by a velocity field u(r, t) and a fluid mass density σ(r, t), both
of which, in general, are unsteady fields, i.e. functions of t as well.

In pre-relativistic physics mass is conserved. There is an equation,
sometimes called the equation of continuity, which expresses this
conservation rule, namely,

$$ \frac{\partial\sigma}{\partial t} + \nabla\cdot(\sigma\mathbf{u}) = 0. \tag{11.9} $$

Note that the product σu of fluid mass density and fluid velocity is the fluid
flux density.

Conservation equations of various physical quantities have a structure
which is similar to Eq. (11.9). Since conservation equations play a very
important role in physics as well as in the remaining part of this chapter, we
shall demonstrate how the above equation is derived.


Fig. 11.3. Illustrating mass conservation. 


Consider the imaginary closed surface S, fixed in space, and embedded
inside a stream of fluid. Since fluid masses flow in and flow out of the
space V inside S, the content of fluid mass in it is a function of time. We
write it as follows:

$$ M(t) = \iiint_V \sigma(\mathbf{r}, t)\,d^3r. \tag{11.10} $$

The change of mass over time dt is

$$ dM = \iiint_V \sigma(\mathbf{r}, t + dt)\,d^3r - \iiint_V \sigma(\mathbf{r}, t)\,d^3r = \left(\iiint_V \frac{\partial\sigma(\mathbf{r}, t)}{\partial t}\,d^3r\right)dt. \tag{11.11} $$

Conservation of a physical quantity means that if the content of this
quantity inside a fixed volume V has increased by a certain amount over a
given time, the same amount must have flowed in through the boundary
surface S in this time. Applied to mass conservation, it means:

$$ dM = -\left(\iint_S \sigma\mathbf{u}\cdot d\mathbf{a}\right)dt = -\left(\iiint_V \nabla\cdot[\sigma(\mathbf{r}, t)\,\mathbf{u}(\mathbf{r}, t)]\,d^3r\right)dt. \tag{11.12} $$


We have used Gauss's divergence theorem to convert the first equality to
the second one. The minus sign appears because the surface integral, if
positive, would mean an outflow (the unit vector n is an outward normal).
Equating the right-hand sides of the above two equations, we get

$$ \left[\iiint_V\left(\frac{\partial\sigma(\mathbf{r}, t)}{\partial t} + \nabla\cdot\big(\sigma(\mathbf{r}, t)\,\mathbf{u}(\mathbf{r}, t)\big)\right)d^3r\right]dt = 0. \tag{11.13} $$

Since the result is valid for arbitrary V and arbitrary dt, the integrand is
zero. Hence Eq. (11.9) follows.

(QED)
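
The bookkeeping used in this derivation, namely that the change of mass inside a fixed volume equals the net inflow through its boundary, can be illustrated with a discrete one-dimensional analogue. The sketch below is only an illustration under stated assumptions: a uniform velocity u > 0, periodic boundaries, and a first-order upwind update chosen for simplicity; none of this is taken from the text.

```python
def step(sigma, u, dx, dt):
    """One conservative upwind update of d(sigma)/dt + d(sigma*u)/dx = 0 (u > 0, periodic)."""
    n = len(sigma)
    flux = [sigma[i] * u for i in range(n)]   # flux leaving cell i through its right face
    # flux[i - 1] with i = 0 wraps around to the last cell (periodic boundary)
    return [sigma[i] - dt / dx * (flux[i] - flux[i - 1]) for i in range(n)]

n, dx, u = 200, 0.01, 1.0
dt = 0.5 * dx / u                             # satisfies the CFL condition
sigma = [1.0 + (0.5 if 40 <= i < 80 else 0.0) for i in range(n)]  # a density bump
a, b = 60, 120                                # fixed control volume: cells a .. b-1

mass_start = sum(sigma[a:b]) * dx
inflow = 0.0
for _ in range(100):
    # in through the left face of cell a, out through the right face of cell b-1
    inflow += (sigma[a - 1] - sigma[b - 1]) * u * dt
    sigma = step(sigma, u, dx, dt)
mass_end = sum(sigma[a:b]) * dx
print(mass_end - mass_start, inflow)
```

The two printed numbers agree to rounding error, which is the discrete counterpart of equating the change of mass in V with the flux through S.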


All conservation laws have the same format as in Eq. (11.9), namely

$$ \frac{\partial}{\partial t}(\textit{volume density}) + \nabla\cdot(\textbf{flux density}) = 0. \tag{11.14} $$


We have written “volume density” in italics, and “flux density” in bold, 
to indicate that the former is a scalar, and the latter a vector. 

How shall we write the continuity equation covariantly, i.e. in the
context of relativity? By (1) making sure that the quantity in question is
conserved, e.g. the energy of a stream of particles flowing like a fluid, or the
electric charge contained in a charged fluid, as in Sec. 11.3; (2) multiplying
the numerator and the denominator of the first term in (11.14) by c. Let ϱ
(pronounced varrho) represent the volume density, and 𝒥 the corresponding
flux density of this quantity (e.g. energy, charge). By definition, the flow of
the quantity per unit time across the surface element da = n da is given by
the relation

$$ d\Phi = \boldsymbol{\mathcal{J}}\cdot d\mathbf{a}. \tag{11.15} $$

We now rewrite the continuity equation (11.14) in the following more
precise language:

$$ \frac{\partial(c\varrho)}{\partial(ct)} + \nabla\cdot\boldsymbol{\mathcal{J}} = 0. \tag{11.16} $$

Written covariantly, as in Eq. (11.29),

$$ \nabla_\mu \mathcal{J}^\mu(x) = 0, \quad\text{where}\quad \mathcal{J}^\mu = (c\varrho,\ \boldsymbol{\mathcal{J}}) \tag{11.17} $$

represent the time and space components of the 4-vector density 𝒥^μ of the
assumed conserved quantity.


11.3. Conservation of Electric Charge 


There are only two types of force which can be understood in the "classical
language", i.e. without using quantum mechanics. One of them is the force
of gravity, the most commonly and universally experienced force of
nature. This force, however, comes under the purview of General Relativity.
Moreover, this force is too weak compared to the electromagnetic forces to
have any effect at all on the motion of subatomic particles, which can be in
relativistic motion in laboratories.

This leaves us with only one force, namely the electromagnetic force,
which has a classical structure. We shall recast this classical structure of
electromagnetic theory into a covariant form in order to illustrate the
language of covariance. In the discussions to follow we shall not present
any detailed discussion or derivation of the formulas, for which the reader
has to look into standard books on electrodynamics.

We shall start with the basic postulates of electrodynamics and express 
them first in the “classical” language and then in the covariant format. 


Postulate I. If q is the electric charge of a particle, moving or stationary, then
the measure of q is the same in all inertial frames. In other words, q is a
4-scalar.


Postulate II. Electric charge is conserved. This charge conservation law is
customarily expressed in the form of the continuity equation (Sec. 11.2.3):

$$ \frac{\partial\rho(\mathbf{r}, t)}{\partial t} + \nabla\cdot\mathbf{J}(\mathbf{r}, t) = 0. \tag{11.18} $$

In the above equation, ρ(r, t) and J(r, t) represent, respectively, the charge
density and the charge current density at the event Θ = (ct, r) = (x), as
measured in a given frame of reference S, which for explicitness we shall
call the Lab frame.

If the electric current distribution is due to a "streamline motion" of an
electrically charged fluid, as shown in Fig. 11.4, and if u(r, t) is the stream's
3-velocity (in the Lab frame) at the event (x), then


J(r,t) = p(r,t) u(r, t). (11.19) 


We can think of a comoving frame of reference S_o, moving with the
charge stream at the event (x). The charge density ρ_o(r, t) = ρ_o(x) at (x),
measured in the frame S_o, will be called the proper charge density of the
fluid at (x), and will be treated as a 4-scalar density. The dynamic Lorentz
factor of the fluid at (x) is

$$ \Gamma(x) = \frac{1}{\sqrt{1 - u^2(x)/c^2}}. \tag{11.20} $$


Fig. 11.4. Lab frame S and comoving frame S_o in a streamline flow of particles.


Imagine a collection of particles containing a quantity of charge δq,
enclosed within the boundaries of a box whose volume is δV in S and δV_o in
S_o, so that

$$ \delta q = \rho_o(x)\,\delta V_o = \rho(x)\,\delta V. \tag{11.21} $$

Lorentz contraction of the dimension of this box along the direction of u
changes its proper volume δV_o to the laboratory volume

$$ \delta V = \frac{\delta V_o}{\Gamma(x)}. \tag{11.22} $$

By (11.21) and (11.22)

$$ \rho(x) = \Gamma(x)\,\rho_o(x). \tag{11.23} $$

The 4-velocity of the charged fluid at (x) is, according to (8.7),

$$ U^\mu(x) = \Gamma(x)\,(c,\ \mathbf{u}(x)). \tag{11.24} $$

Therefore, we define the 4-current density of the electric charge fluid as

$$ J^\mu(x) = \rho_o(x)\,U^\mu(x). \tag{11.25} $$

From (11.25), (11.24) and (11.23)

$$ J^\mu(x) = \rho_o\,\Gamma(x)\,(c,\ \mathbf{u}(x)) = (\rho(x)\,c,\ \rho(x)\,\mathbf{u}(x)) = (\rho(x)\,c,\ \mathbf{J}(x)). \tag{11.26} $$

Thus, ρ(x) times c constitutes the time component, and J(x) constitutes the
three space components of J^μ(x). Let us rewrite the continuity equation
(11.18) as

$$ \frac{\partial(c\rho)}{\partial(ct)} + \frac{\partial J_x}{\partial x} + \frac{\partial J_y}{\partial y} + \frac{\partial J_z}{\partial z} = 0. \tag{11.27} $$

The 4-gradient operators V,,, V” were introduced and elaborated through 
Eqs. (7.72), (7.113). We rewrite the first one explicitly as follows: 


y Oo Oo O Oo ð ` 8 c 112 
ll —, —, —, — — — : ( 238) 
= Ore ar’ rl’ x2’ Ors Oct 


The charge continuity equation (11.18), rewritten in (11.27), now assumes 
the form: 


VJ" (2) = 0. (11.29) 


The left-hand side of the equation appears like a contraction, reducing it to 
a 4-scalar. The right-hand side is also a 4-scalar, having a single component 
0. Hence, it is a covariant equation. 
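The relations (11.23)-(11.26) can be checked numerically. The following is a minimal sketch in Python, assuming units with c = 1 and a boost along the X-axis; the function names (`lorentz_factor`, `four_current`, `boost_x`) are illustrative choices, not notation from the text. It builds $J^\mu = \rho_0 U^\mu$ in the Lab frame and boosts it into the comoving frame, where only the time component $\rho_0 c$ should survive.

```python
import math

def lorentz_factor(beta):
    """Lorentz factor for a dimensionless speed beta = v/c."""
    return 1.0 / math.sqrt(1.0 - beta * beta)

def four_current(rho0, u, c=1.0):
    """J^mu = rho0 * U^mu with U^mu = Gamma * (c, u), as in (11.24)-(11.25)."""
    speed = math.sqrt(sum(ui * ui for ui in u))
    gam = lorentz_factor(speed / c)
    return [rho0 * gam * c] + [rho0 * gam * ui for ui in u]

def boost_x(vec, beta):
    """Components of a contravariant 4-vector in a frame moving with beta*c along +x."""
    gam = lorentz_factor(beta)
    v0, v1, v2, v3 = vec
    return [gam * (v0 - beta * v1), gam * (v1 - beta * v0), v2, v3]

# Fluid element with proper density rho0 = 2 streaming at u = 0.6c along x.
J_lab = four_current(2.0, (0.6, 0.0, 0.0))
J_comoving = boost_x(J_lab, 0.6)
print("Lab frame J^mu      :", J_lab)       # time component is rho*c, rho = Gamma*rho0
print("Comoving frame J^mu :", J_comoving)  # spatial part vanishes up to rounding
```

The Lab-frame time component comes out larger than $\rho_0 c$ by the factor $\Gamma$, which is the content of (11.23).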


11.4. The Electromagnetic Field Tensor 


Let us make the third postulate of electrodynamics. 


Postulate III. The force experienced by a particle carrying an electric charge 
q is a velocity-dependent force, called Lorentz force (see Eq. 10.12), written 
as 


F =q(E +v x B), (11.30) 


where v is the velocity of the charged particle at the event point (x). The 
above equation also serves as the definition of the electric field E and the 
magnetic field B at the location of the particle. Let us write the dynamic 
Lorentz factor for the particle’s velocity: 

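$$\Gamma = \frac{1}{\sqrt{1 - v^2/c^2}}. \tag{11.31}$$

Substituting the Lorentz force (11.30) in (8.36), the time and space
components of the corresponding Minkowski force $\mathcal{F}^\mu$ are now obtained
compactly as

$$\mathcal{F}^\mu = q\,\Gamma\left(\frac{1}{c}\,\mathbf{E}\cdot\mathbf{v},\ \mathbf{E} + \mathbf{v}\times\mathbf{B}\right), \tag{11.32}$$

and in an expanded form as

$$\begin{aligned}
\mathcal{F}^0 &= \frac{\Gamma}{c}\,\mathbf{F}\cdot\mathbf{v} = \frac{q\Gamma}{c}\,(E_x v_x + E_y v_y + E_z v_z),\\
\mathcal{F}^1 &= \Gamma F_x = \frac{q\Gamma}{c}\,(E_x c + cB_z v_y - cB_y v_z),\\
\mathcal{F}^2 &= \Gamma F_y = \frac{q\Gamma}{c}\,(E_y c + cB_x v_z - cB_z v_x),\\
\mathcal{F}^3 &= \Gamma F_z = \frac{q\Gamma}{c}\,(E_z c + cB_y v_x - cB_x v_y).
\end{aligned} \tag{11.33}$$

The above equation tells us that the Minkowski 4-force acting on a charged
particle q is a linear function of its 4-velocity, and therefore can be written
as

$$\mathcal{F}^\mu = \frac{q}{c}\,F^{\mu\nu}\,V_\nu. \tag{11.34}$$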


In the above equation, $V_\nu$ is the covariant form of the 4-velocity vector $V^\mu$,
defined in (8.7) and is obtained from it by the lowering operation,

$$V_\nu = g_{\nu\mu}V^\mu = (\Gamma c, -\Gamma\mathbf{v}) = (\Gamma c, -\Gamma v_x, -\Gamma v_y, -\Gamma v_z). \tag{11.35}$$

$F^{\mu\nu}$, as defined by (11.34), is a very important contravariant tensor of rank
2, called the electromagnetic field tensor, and must have the following
components as suggested by Eq. (11.33) (rows $\mu = 0, 1, 2, 3$; columns $\nu = 0, 1, 2, 3$):

$$F^{\mu\nu} = \begin{pmatrix}
0 & -E_x & -E_y & -E_z\\
E_x & 0 & -cB_z & cB_y\\
E_y & cB_z & 0 & -cB_x\\
E_z & -cB_y & cB_x & 0
\end{pmatrix}, \qquad F^{\mu\nu} = -F^{\nu\mu}. \tag{11.36}$$


The electromagnetic field tensor as written in (11.36) is an 
antisymmetric contravariant 4-tensor. Hence, its diagonal elements are all 
zero, and the off-diagonal elements on the upper side of the diagonal are 
equal and opposite to the corresponding off-diagonal elements on the lower 
side of the diagonal. Hence, it has only six independent components, and 
they are the six 3-scalar components of the (E, B) field. 
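As a numerical cross-check of (11.32)-(11.36), the sketch below (plain Python; the function names are illustrative assumptions, not the book's notation) evaluates the Minkowski force in two ways: directly as $q\Gamma(\mathbf{E}\cdot\mathbf{v}/c,\ \mathbf{E} + \mathbf{v}\times\mathbf{B})$, and as the contraction $(q/c)F^{\mu\nu}V_\nu$ with $V_\nu = (\Gamma c, -\Gamma\mathbf{v})$. The fields are passed as E and cB, in the same units.

```python
import math

def field_tensor(E, cB):
    """Contravariant field tensor F^{mu nu} laid out as in (11.36)."""
    Ex, Ey, Ez = E
    Bx, By, Bz = cB
    return [
        [0.0, -Ex, -Ey, -Ez],
        [Ex, 0.0, -Bz, By],
        [Ey, Bz, 0.0, -Bx],
        [Ez, -By, Bx, 0.0],
    ]

def force_from_tensor(q, E, cB, v, c=1.0):
    """Minkowski force from (q/c) F^{mu nu} V_nu, with V_nu = Gamma (c, -v)."""
    gam = 1.0 / math.sqrt(1.0 - sum(vi * vi for vi in v) / c**2)
    V_low = [gam * c, -gam * v[0], -gam * v[1], -gam * v[2]]
    F = field_tensor(E, cB)
    return [q / c * sum(F[m][n] * V_low[n] for n in range(4)) for m in range(4)]

def force_direct(q, E, cB, v, c=1.0):
    """Minkowski force q Gamma (E.v / c, E + v x B), with B = cB / c."""
    gam = 1.0 / math.sqrt(1.0 - sum(vi * vi for vi in v) / c**2)
    B = [b / c for b in cB]
    cross = [v[1] * B[2] - v[2] * B[1],
             v[2] * B[0] - v[0] * B[2],
             v[0] * B[1] - v[1] * B[0]]
    time_part = q * gam * sum(e * vi for e, vi in zip(E, v)) / c
    return [time_part] + [q * gam * (E[i] + cross[i]) for i in range(3)]

args = (1.5, (1.0, 2.0, 3.0), (4.0, 5.0, 6.0), (0.3, -0.2, 0.5))
print(force_from_tensor(*args))
print(force_direct(*args))  # should agree component by component
```

Agreement of the two lists for arbitrary inputs is what fixes the signs of the entries in (11.36).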

Equation (11.34) is the covariant expression of the Lorentz force 
equation (11.30). Using the Lorentz transformation formula (7.80) for a 
contravariant tensor, the reader should be able to obtain the following 
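transformation rule for the components of the (E, B) field under the boost
$S(\beta, 0, 0)S'$:

$$\begin{aligned}
E'_x &= E_x, & cB'_x &= cB_x,\\
E'_y &= \gamma(E_y - \beta\,cB_z), & cB'_y &= \gamma(cB_y + \beta E_z),\\
E'_z &= \gamma(E_z + \beta\,cB_y), & cB'_z &= \gamma(cB_z - \beta E_y).
\end{aligned} \tag{11.37}$$

To illustrate the procedure, we shall work out two of the above
transformation formulas in detail, namely for $E'_x$ and $cB'_z$, using the Lorentz
matrix $\Omega$ as given in (7.69), to transform the components of the tensor (11.36):

$$\begin{aligned}
E'_x &= F'^{10} = \Omega^1{}_\alpha\,\Omega^0{}_\beta\,F^{\alpha\beta} = \Omega^1{}_1\Omega^0{}_0 F^{10} + \Omega^1{}_0\Omega^0{}_1 F^{01} = \gamma^2(1-\beta^2)E_x = E_x,\\
cB'_z &= F'^{21} = \Omega^2{}_\alpha\,\Omega^1{}_\beta\,F^{\alpha\beta} = \Omega^2{}_2\Omega^1{}_0 F^{20} + \Omega^2{}_2\Omega^1{}_1 F^{21} = \gamma(cB_z - \beta E_y).
\end{aligned} \tag{11.38}$$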


Note that even though each of the two lines above apparently involves a sum
of 16 terms, corresponding to $\alpha, \beta = 0, 1, 2, 3$, we have accommodated only
the non-zero terms, which happen to be only two in number in each case.
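The full 16-term sums can be delegated to a few lines of code. The sketch below (plain Python, units in which the fields are given as E and cB; all function names are illustrative assumptions) transforms the tensor (11.36) with the standard X-boost matrix, reads the primed fields back from the transformed tensor, and compares them with the closed formulas (11.37).

```python
import math

def field_tensor(E, cB):
    """Contravariant field tensor F^{mu nu} laid out as in (11.36)."""
    Ex, Ey, Ez = E
    Bx, By, Bz = cB
    return [[0.0, -Ex, -Ey, -Ez],
            [Ex, 0.0, -Bz, By],
            [Ey, Bz, 0.0, -Bx],
            [Ez, -By, Bx, 0.0]]

def boost_matrix(beta):
    """Lorentz matrix for a boost with velocity beta*c along +x."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[g, -g * beta, 0.0, 0.0],
            [-g * beta, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def transform_tensor(F, L):
    """F'^{mu nu} = L^mu_alpha L^nu_beta F^{alpha beta} (all 16 terms per component)."""
    return [[sum(L[m][a] * L[n][b] * F[a][b] for a in range(4) for b in range(4))
             for n in range(4)] for m in range(4)]

def fields_from_tensor(F):
    """Read (E, cB) back from the tensor using the layout of (11.36)."""
    return (F[1][0], F[2][0], F[3][0]), (F[3][2], F[1][3], F[2][1])

def transform_fields(E, cB, beta):
    """Closed-form field transformation (11.37)."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    Ep = (E[0], g * (E[1] - beta * cB[2]), g * (E[2] + beta * cB[1]))
    cBp = (cB[0], g * (cB[1] + beta * E[2]), g * (cB[2] - beta * E[1]))
    return Ep, cBp

E, cB, beta = (1.0, 2.0, 3.0), (4.0, 5.0, 6.0), 0.6
print(fields_from_tensor(transform_tensor(field_tensor(E, cB), boost_matrix(beta))))
print(transform_fields(E, cB, beta))  # the two results should coincide
```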


11.5. The Field Equations of Electrodynamics in the Covariant 
Language 


We shall now establish covariance of the field equations of 
electrodynamics. These equations, expressed in their more familiar non- 
covariant form, are known as Maxwell’s equations. They separate out into 
two sets of 1+3 equations, namely (1) the inhomogeneous equations 
containing the source terms, and (2) the homogeneous equations without 
the source terms: 


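Inhomogeneous part:

$$\nabla\cdot\mathbf{E}(\mathbf{r},t) = \frac{1}{\varepsilon_0 c}\,c\rho(\mathbf{r},t), \tag{11.39a}$$

$$\nabla\times c\mathbf{B}(\mathbf{r},t) - \frac{\partial \mathbf{E}(\mathbf{r},t)}{\partial (ct)} = \frac{1}{\varepsilon_0 c}\,\mathbf{J}(\mathbf{r},t), \tag{11.39b}$$

Homogeneous part:

$$\nabla\cdot c\mathbf{B}(\mathbf{r},t) = 0, \tag{11.39c}$$

$$\nabla\times\mathbf{E}(\mathbf{r},t) + \frac{\partial c\mathbf{B}(\mathbf{r},t)}{\partial (ct)} = 0. \tag{11.39d}$$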


To convert the inhomogeneous part, given by (11.39a) and (11.39b), we
first expand them into four separate equations in terms of the Cartesian
components of E, cB, J. Identifying these terms in the expanded expression
as the components of $J^\mu(x)$ and $F^{\mu\nu}(x)$, with the help of (11.26) and (11.36),
it should be easy to see that these two 1+3 equations fuse into a single
4-equation, i.e. one covariant equation:

$$\nabla_\alpha F^{\alpha\mu}(x) = \frac{1}{\varepsilon_0 c}\,J^\mu(x). \tag{11.40}$$


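We shall demonstrate explicitly how (11.39a) shapes into the $\mu = 0$
component of (11.40):

$$\text{Left side} = \frac{\partial E_x}{\partial x} + \frac{\partial E_y}{\partial y} + \frac{\partial E_z}{\partial z} = \nabla_1 F^{10} + \nabla_2 F^{20} + \nabla_3 F^{30} = \nabla_\alpha F^{\alpha 0}.$$

$$\text{Right side} = \frac{1}{\varepsilon_0 c}\,c\rho = \frac{1}{\varepsilon_0 c}\,J^0. \quad \text{Hence, } \nabla_\alpha F^{\alpha 0} = \frac{1}{\varepsilon_0 c}\,J^0. \quad \text{(QED)}$$

With the above hint the reader should be able to demonstrate equivalence
between the x, y, z components of (11.39b) and the $\mu = 1, 2, 3$ components
of (11.40).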

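Now we shall convert Eqs. (11.39c) and (11.39d) into covariant form.
For this purpose, we shall obtain from $F^{\mu\nu}$ its dual $\mathcal{G}^{\mu\nu}$ by the following
operation:

$$\mathcal{G}^{\mu\nu}(x) = \frac{1}{2}\,\epsilon^{\mu\nu\alpha\beta}\,F_{\alpha\beta}(x). \tag{11.41}$$

Here $F_{\alpha\beta}$ is the covariant tensor obtained from $F^{\mu\nu}$ by lowering both indices
(Reader, confirm it):

$$F_{\alpha\beta} = g_{\alpha\mu}\,g_{\beta\nu}\,F^{\mu\nu} = \begin{pmatrix}
0 & E_x & E_y & E_z\\
-E_x & 0 & -cB_z & cB_y\\
-E_y & cB_z & 0 & -cB_x\\
-E_z & -cB_y & cB_x & 0
\end{pmatrix}, \tag{11.42}$$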


and $\epsilon^{\mu\nu\alpha\beta}$ is the Levi-Civita symbol. We had defined this as a 3-symbol in
Sec. 5.1.3, under Eq. (5.13). We now extend the same symbol to a
4-symbol, defined as

$$\epsilon^{\mu\nu\alpha\beta} = \begin{cases}
0 & \text{if any two indices are equal},\\
+1 & \text{if } \mu\nu\alpha\beta = 0123, \text{ or any even permutation of } 0123,\\
-1 & \text{if } \mu\nu\alpha\beta \text{ is any odd permutation of } 0123.
\end{cases} \tag{11.43}$$

For example, 1023 is obtained from 0123 by a single interchange of adjacent
indices (i.e. an odd permutation), and 1203 by two such interchanges (i.e. an
even permutation). The reader should prove the following important identity:

$$\Omega^\mu{}_\alpha\,\Omega^\nu{}_\beta\,\Omega^\kappa{}_\gamma\,\Omega^\lambda{}_\delta\,\epsilon^{\alpha\beta\gamma\delta} = \epsilon^{\mu\nu\kappa\lambda}, \tag{11.44}$$

where $\Omega$ represents a proper Lorentz transformation matrix. As a
consequence, we can regard the Levi-Civita symbol to be a contravariant
tensor of rank 4, which transforms into itself under a proper Lorentz
transformation. There is only one other tensor, namely, the metric tensor $g_{\mu\nu}$,
which has a similar property. Going back to (11.41) and (11.42), the reader
should work out the components of $\mathcal{G}^{\mu\nu}$, and show that

$$\mathcal{G}^{\mu\nu} = \begin{pmatrix}
0 & -cB_x & -cB_y & -cB_z\\
cB_x & 0 & E_z & -E_y\\
cB_y & -E_z & 0 & E_x\\
cB_z & E_y & -E_x & 0
\end{pmatrix}. \tag{11.45}$$

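It should now be a simple exercise to show that Eqs. (11.39c) and
(11.39d) can be written covariantly as

$$\nabla_\alpha\,\mathcal{G}^{\alpha\mu} = 0. \tag{11.46}$$

We shall demonstrate explicitly for one component, say $\mu = 3$:

$$\begin{aligned}
\nabla_\alpha\,\mathcal{G}^{\alpha 3} &= \nabla_0\,\mathcal{G}^{03} + \nabla_1\,\mathcal{G}^{13} + \nabla_2\,\mathcal{G}^{23} + \nabla_3\,\mathcal{G}^{33}\\
&= \frac{\partial}{\partial (ct)}(-cB_z) + \frac{\partial}{\partial x}(-E_y) + \frac{\partial}{\partial y}(E_x)\\
&= -\left(\frac{\partial E_y}{\partial x} - \frac{\partial E_x}{\partial y} + \frac{\partial\,cB_z}{\partial (ct)}\right) = -\left[\nabla\times\mathbf{E} + \frac{\partial\,c\mathbf{B}}{\partial (ct)}\right]_z = 0. \quad \text{(QED)}
\end{aligned}$$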
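The component work behind (11.41)-(11.45) is mechanical and can be delegated to a short script. The sketch below (plain Python; function names are illustrative assumptions) builds the 4-index Levi-Civita symbol of (11.43) from permutation parity, lowers the indices of $F^{\mu\nu}$ with the metric diag(1, -1, -1, -1) implied by (11.42), and forms the dual with the factor 1/2 assumed in (11.41).

```python
from itertools import permutations

def field_tensor(E, cB):
    """Contravariant field tensor F^{mu nu} laid out as in (11.36)."""
    Ex, Ey, Ez = E
    Bx, By, Bz = cB
    return [[0.0, -Ex, -Ey, -Ez],
            [Ex, 0.0, -Bz, By],
            [Ey, Bz, 0.0, -Bx],
            [Ez, -By, Bx, 0.0]]

def levi_civita4():
    """Non-zero entries of the Levi-Civita symbol: permutation of 0123 -> +1 or -1."""
    eps = {}
    for perm in permutations(range(4)):
        inversions = sum(1 for i in range(4) for j in range(i + 1, 4)
                         if perm[i] > perm[j])
        eps[perm] = -1 if inversions % 2 else 1
    return eps

def lower_both(F):
    """F_{alpha beta} = g g F with g = diag(1, -1, -1, -1), as in (11.42)."""
    g = [1.0, -1.0, -1.0, -1.0]
    return [[g[a] * g[b] * F[a][b] for b in range(4)] for a in range(4)]

def dual(F):
    """Dual tensor: one half of epsilon^{mu nu alpha beta} F_{alpha beta}."""
    F_low = lower_both(F)
    G = [[0.0] * 4 for _ in range(4)]
    for (m, n, a, b), sign in levi_civita4().items():
        G[m][n] += 0.5 * sign * F_low[a][b]
    return G

# With E = (1, 2, 3) and cB = (4, 5, 6), compare the printed rows with (11.45).
for row in dual(field_tensor((1.0, 2.0, 3.0), (4.0, 5.0, 6.0))):
    print(row)
```

In effect the dual is obtained from $F^{\mu\nu}$ by the replacement $\mathbf{E} \to c\mathbf{B}$, $c\mathbf{B} \to -\mathbf{E}$.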
Or Oy Oct m 


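In summary, Maxwell’s equations written in 1+3 forms in (11.39)
reduce to two covariant equations:

$$\begin{aligned}
\text{Inhomogeneous part:}\quad & \nabla_\alpha F^{\alpha\mu}(x) = \frac{1}{\varepsilon_0 c}\,J^\mu(x), && \text{(a)}\\
\text{Homogeneous part:}\quad & \nabla_\alpha\,\mathcal{G}^{\alpha\mu}(x) = 0. && \text{(b)}
\end{aligned} \tag{11.47}$$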


There is one more way to express the homogeneous equation, namely,

$$\nabla^\mu F^{\nu\eta} + \nabla^\nu F^{\eta\mu} + \nabla^\eta F^{\mu\nu} = 0. \tag{11.48}$$

Note that there is a cyclic permutation of the three indices $\mu, \nu, \eta$ in the
above equation. Also, due to antisymmetry of $F^{\mu\nu}$ the left-hand side is
identically zero, unless $\mu, \nu, \eta$ are all different. This equation therefore
represents only four equations, namely corresponding to $(\mu, \nu, \eta)$ = (0, 1, 2),
(0, 2, 3), (0, 3, 1), (1, 2, 3). These four equations correspond to the 1 + 3
equations represented by (11.39c) and (11.39d). The four components of the
operator $\nabla^\mu$ are shown in (7.113b).

We shall verify equivalence between (11.48) and (11.39c) and (11.39d)
for one combination, namely $\mu\nu\eta = 012$, leaving to the reader verification
for the other three combinations:

$$\nabla^0 F^{12} + \nabla^1 F^{20} + \nabla^2 F^{01} = \frac{\partial(-cB_z)}{\partial (ct)} - \frac{\partial E_y}{\partial x} - \frac{\partial(-E_x)}{\partial y} = -\left[\nabla\times\mathbf{E} + \frac{\partial\,c\mathbf{B}}{\partial (ct)}\right]_z = 0. \quad \text{(QED)}$$


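Therefore, Maxwell’s equations written in 1+3 forms in (11.39) can be
expressed covariantly in the second way:

$$\begin{aligned}
\text{Inhomogeneous part:}\quad & \nabla_\alpha F^{\alpha\mu}(x) = \frac{1}{\varepsilon_0 c}\,J^\mu(x),\\
\text{Homogeneous part:}\quad & \nabla^\mu F^{\nu\eta} + \nabla^\nu F^{\eta\mu} + \nabla^\eta F^{\mu\nu} = 0.
\end{aligned} \tag{11.49}$$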


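There is a third way, perhaps a more powerful way, of expressing
Maxwell’s equations, namely, by writing them in terms of potentials. The
homogeneous equations, i.e. Eqs. (11.39c) and (11.39d), define the scalar
potential $\Phi(\mathbf{r}, t)$ and the vector potential $\mathbf{A}(\mathbf{r}, t)$ through the relations:

$$c\mathbf{B}(\mathbf{r},t) = \nabla\times c\mathbf{A}(\mathbf{r},t), \tag{11.50a}$$

$$\mathbf{E}(\mathbf{r},t) = -\nabla\Phi(\mathbf{r},t) - \frac{\partial\,c\mathbf{A}(\mathbf{r},t)}{\partial (ct)}, \tag{11.50b}$$

because if we write (E, B) in the form of (11.50), the homogeneous part
(11.39c) and (11.39d) of Maxwell’s equations will be identically satisfied.
Adopting Lorentz gauge:

$$\nabla\cdot\mathbf{A}(\mathbf{r},t) + \frac{\partial\,[\Phi(\mathbf{r},t)/c]}{\partial (ct)} = 0, \tag{11.51}$$

the source equations now reduce to the following inhomogeneous wave
equations:

$$\nabla^2\Phi - \frac{\partial^2\Phi}{\partial (ct)^2} = -\frac{\rho}{\varepsilon_0}, \tag{11.52a}$$

$$\nabla^2\mathbf{A} - \frac{\partial^2\mathbf{A}}{\partial (ct)^2} = -\frac{1}{\varepsilon_0 c^2}\,\mathbf{J}. \tag{11.52b}$$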


The set of Eqs. (11.50)-(11.52) constitutes the potential form of
Maxwell’s equations (11.39); together they are equivalent to the complete
set. We shall convert each of them into a covariant equation.

For this, we first define the 4-potential $A^\mu$ to be a contravariant 4-vector
with time and space components:

$$A^\mu(x) = \left(\frac{\Phi(x)}{c},\ \mathbf{A}(x)\right), \tag{11.53}$$

and the d’Alembertian operator $\Box^2$, obtained by taking the “scalar product”
of the operator $\nabla_\mu$ with itself, as shown in (7.1120).

It is now obvious that the two 3-vector equations (11.50), defining the
scalar and vector potentials, are equivalent to the following single “covariant
equation”, obtained by replacing the components of (E, cB) and ($\Phi/c$, A) by
the corresponding components of $F^{\mu\nu}$ and $A^\mu$, identified from (11.36) and
(11.53), respectively:

$$F^{\mu\nu}(x) = c\,[\nabla^\mu A^\nu(x) - \nabla^\nu A^\mu(x)]. \tag{11.54}$$

We shall demonstrate this equivalence for one specific example, namely, $\mu\nu = 10$:

$$\text{Right side} = c\,[\nabla^1 A^0 - \nabla^0 A^1] = -\frac{\partial\Phi}{\partial x} - \frac{\partial\,cA_x}{\partial (ct)} = E_x = F^{10}. \quad \text{(QED)}$$

Also, the Lorentz gauge condition (11.51) is clearly seen to be
equivalent to the following covariant equation:

$$\nabla_\mu A^\mu(x) = 0. \tag{11.55}$$


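Finally, the 1+3 inhomogeneous wave equations for the scalar and
vector potentials, namely Eq. (11.52), now become one single covariant
wave equation for the 4-potential:

$$\Box^2 A^\mu(x) = \frac{1}{\varepsilon_0 c^2}\,J^\mu(x). \tag{11.56}$$

In summary, we have the Potential form of Maxwell’s equations:

$$\begin{aligned}
\text{4-Pot defined:}\quad & F^{\mu\nu}(x) = c\,[\nabla^\mu A^\nu(x) - \nabla^\nu A^\mu(x)],\\
\text{Lorentz Gauge:}\quad & \nabla_\mu A^\mu(x) = 0,\\
\text{Wave equation:}\quad & \Box^2 A^\mu(x) = \frac{1}{\varepsilon_0 c^2}\,J^\mu(x).
\end{aligned} \tag{11.57}$$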

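We can supplement these equations with two more equations which
represent the only invariants of the electromagnetic fields (i.e. they retain
their values under a Lorentz transformation):

$$F^{\alpha\beta}F_{\alpha\beta} = 2\,(c^2B^2 - E^2), \tag{11.58a}$$

$$\mathcal{G}^{\alpha\beta}F_{\alpha\beta} = -4c\,\mathbf{B}\cdot\mathbf{E}, \tag{11.58b}$$

where $\mathcal{G}^{\alpha\beta}$ is the dual of the field tensor, given in (11.45). Verifying
these two relations is an exercise best left to the reader. There are some
interesting consequences of the above invariants, which the reader will
explore through some examples in the problem set.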
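A quick numerical experiment makes the invariance tangible. The sketch below (plain Python; the function names and the sample field values are illustrative assumptions) evaluates both contractions from the matrices (11.36), (11.42) and (11.45), then repeats the calculation with fields transformed by (11.37).

```python
import math

def invariants(E, cB):
    """Return (F^{ab} F_{ab}, G^{ab} F_{ab}) built from the matrices (11.36), (11.42), (11.45)."""
    Ex, Ey, Ez = E
    Bx, By, Bz = cB
    F_up = [[0.0, -Ex, -Ey, -Ez], [Ex, 0.0, -Bz, By],
            [Ey, Bz, 0.0, -Bx], [Ez, -By, Bx, 0.0]]
    F_low = [[0.0, Ex, Ey, Ez], [-Ex, 0.0, -Bz, By],
             [-Ey, Bz, 0.0, -Bx], [-Ez, -By, Bx, 0.0]]
    G_up = [[0.0, -Bx, -By, -Bz], [Bx, 0.0, Ez, -Ey],
            [By, -Ez, 0.0, Ex], [Bz, Ey, -Ex, 0.0]]

    def contract(A, B):
        # Full contraction over both index pairs.
        return sum(A[a][b] * B[a][b] for a in range(4) for b in range(4))

    return contract(F_up, F_low), contract(G_up, F_low)

def transform_fields(E, cB, beta):
    """Field transformation (11.37) for a boost beta*c along +x."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    Ep = (E[0], g * (E[1] - beta * cB[2]), g * (E[2] + beta * cB[1]))
    cBp = (cB[0], g * (cB[1] + beta * E[2]), g * (cB[2] - beta * E[1]))
    return Ep, cBp

E, cB = (1.0, 2.0, 3.0), (4.0, 5.0, 6.0)
print("unprimed frame:", invariants(E, cB))
print("boosted frame :", invariants(*transform_fields(E, cB, 0.6)))
```

For the sample values the first contraction equals 2(77 - 14) and the second equals -4(4 + 10 + 18), and both values are the same in the two frames up to rounding.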

Before leaving this chapter, we remind the reader of the covariant
expressions of the equations of Classical Electrodynamics, laid down in the
form of

(1) Continuity equation: (11.29);
(2) Minkowski force (corresponding to the Lorentz force equation): (11.34);
(3) Maxwell’s equations in terms of the field tensor in two ways: (11.47),
(11.49);
(4) Maxwell’s equations in terms of the 4-potential: (11.57).

We have framed each of these equations in a box to stress their
importance.


11.6. EM Field of a Charged Particle in Uniform Motion 


What we are doing in this section is partly a reflection of the statements we 
had made in Sec. 10.2. What we had done in that section using force 
transformation, will be done here for a very specific example, using field 
transformation. It is heartening to see that the force transformation and the
field transformation, though they evolve from two entirely different
offshoots of the special theory of relativity, blend with each other, showing a
larger homogeneity and consistency of the theory.


11.6.1. Transformation from the rest frame to Lab frame 


The charge q is seated permanently at the origin of its rest frame S'. It is
now viewed from the Lab frame S, moving in the negative X direction with
velocity $\beta c$ relative to S', so that the charge is moving in the positive X
direction with respect to the Lab, as explained in Fig. 11.5. The origins of S
and S' coincide at t = t' = 0. Let us set $k = \dfrac{q}{4\pi\varepsilon_0}$. The EM field in the rest
frame is

$$\mathbf{E}' = \frac{k\,(x'\mathbf{i} + y'\mathbf{j} + z'\mathbf{k})}{r'^3}, \qquad c\mathbf{B}' = 0, \tag{11.59}$$

where $r'^2 = x'^2 + y'^2 + z'^2$.

Let us now consider an event $\Theta$ having coordinates (x, y, z, ct) in S and
(x', y', z', ct') in S'. Let the EM field at this event be (E, cB) in S and (E',
cB') in S'. The Lorentz transformation of the coordinates from S' to S is
given by the formula (3.9), whereas the Lorentz transformation of the field
components (at the event $\Theta$) from S' to S is the inverse of the equations
(11.37), i.e. with $\beta$ replaced by $-\beta$.


Fig. 11.5. The rest frame S' of the point charge q moving with velocity $c\beta$ with respect to the
Observer’s frame S.

Now consider the following event $\Theta_P$ = “the EM field due to the above
point charge q is detected by a certain observer P” situated on the XY-plane,
so that z = z' = 0. Ignoring the z coordinate, we transform the field given in
(11.59) to the frame S, using (3.9) and (11.37). In this first conversion, the
radial distance r' transforms into R, which we write in terms of the coordinates in S:

$$r'^2 = \gamma^2(x - \beta ct)^2 + y^2 \equiv R^2. \tag{11.60}$$

The EM field (E', cB') in S' transforms into the EM field (E, cB) in S,
whose components are given as

$$\begin{aligned}
E_x &= E'_x = k\,\frac{x'}{r'^3} = k\,\frac{\gamma(x - \beta ct)}{R^3}, & cB_x &= cB'_x = 0,\\
E_y &= \gamma E'_y = k\gamma\,\frac{y'}{r'^3} = k\gamma\,\frac{y}{R^3}, & cB_y &= \gamma(cB'_y - \beta E'_z) = 0,\\
E_z &= \gamma E'_z = k\gamma\,\frac{z'}{r'^3} = k\gamma\,\frac{z}{R^3} = 0, & cB_z &= \gamma(cB'_z + \beta E'_y) = k\gamma\beta\,\frac{y}{R^3}.
\end{aligned} \tag{11.61}$$


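Let $\mathbf{r}$ be the radius vector stretching from the instantaneous location of
charge q at the time t to the field point P, as seen from the Lab frame S.
Then

$$\mathbf{r} = (x - \beta ct)\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}, \tag{11.62}$$

and note that $\boldsymbol{\beta}\times\mathbf{E} = \beta\,\mathbf{i}\times(E_x\mathbf{i} + E_y\mathbf{j}) = \beta E_y\,\mathbf{k}$.

Now the six components of the transformed field written in (11.61) can be
written compactly as

$$\mathbf{E} = \frac{q}{4\pi\varepsilon_0}\,\frac{\gamma\,\mathbf{r}}{R^3}, \tag{11.63a}$$

$$c\mathbf{B} = \boldsymbol{\beta}\times\mathbf{E}. \tag{11.63b}$$

Note that Eq. (11.63) can be obtained entirely from (10.9) of Sec. 10.2, if
we write for the E' field the expression given in (11.59).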


11.6.2. Pictorial interpretation of the fields 


To get a picture of the E field, we shall obtain two sets of plots, showing (a)
its angular distribution around the moving charge q; (b) its time variation at
any given observation point P. For each case, we shall use “Gnuplot” to
create exact plots corresponding to three values of $\beta$, namely $\beta \ll 1$, $\beta$ = 0.8,
0.95 in the first case, and $\beta$ = 0.4, 0.8, 0.95 in the second case.


Angular distribution of the field around the moving charge 

We shall obtain a pictorial interpretation of the E field described by Eq. 
(11.63a). Even though we had taken the field point P on the XY-plane for 
convenience, that restriction is withdrawn for writing the general expression 
for the field in terms of the radius vectors. The E field is radial in both 
frames of reference S' and S (i.e. emanating radially from the instantaneous 
location of the point charge). However, it is the isotropic Coulomb field in
the rest frame S', whereas it is angle dependent in the Lab frame S.

To picture this we have set up “displaced” Cartesian axes $\tilde{X}, \tilde{Y}, \tilde{Z}$ of the
Lab frame, the origin A of which coincides with the instantaneous location
of the charge q at the instant t, shown in Fig. 11.6. We shall denote the
“displaced” Cartesian coordinates as $(\tilde{x}, \tilde{y}, \tilde{z})$ and set up the spherical
coordinates $(r, \theta, \phi)$, as illustrated in Fig. 11.6. It is obvious that $\tilde{x} = x - \beta ct$,
whereas the “displaced” y, z coordinates are the same as with respect to the
original axes.

The relations (1) between the $(\tilde{x}, \tilde{y}, \tilde{z})$ and $(r, \theta, \phi)$, and (2) between the
Cartesian unit vectors and the spherical unit vectors, are given by the
following formulas:

$$\begin{aligned}
\tilde{y} &= r\sin\theta\cos\phi, & \mathbf{e}_r &= \sin\theta\,(\cos\phi\,\mathbf{j} + \sin\phi\,\mathbf{k}) + \cos\theta\,\mathbf{i},\\
\tilde{z} &= r\sin\theta\sin\phi, & \mathbf{e}_\theta &= \cos\theta\,(\cos\phi\,\mathbf{j} + \sin\phi\,\mathbf{k}) - \sin\theta\,\mathbf{i},\\
\tilde{x} &= r\cos\theta, & \mathbf{e}_\phi &= -\sin\phi\,\mathbf{j} + \cos\phi\,\mathbf{k}.
\end{aligned} \tag{11.64}$$


z 


Fig. 11.6. Displaced axes with its origin at the instantaneous location of the moving particle at time 
t. 


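The (E, B) field of Eq. (11.63) can be written as

$$\mathbf{E}(r,\theta,\phi) = k\gamma\,\frac{r}{R^3}\,\mathbf{e}_r, \tag{11.65a}$$

$$c\mathbf{B}(r,\theta,\phi) = \beta E\sin\theta\;\mathbf{e}_\phi. \tag{11.65b}$$

To make the picture complete, we need to express R in terms of r,
$\theta$ and $\beta$, beginning with Eq. (11.60) (with the z coordinate restored) and
exploiting the Cartesian → spherical conversion formulas (11.64):

$$R^2 = \gamma^2\tilde{x}^2 + y^2 + z^2 = \gamma^2 r^2\,(1 - \beta^2\sin^2\theta). \tag{11.66}$$

Therefore, we can rewrite Eq. (11.65) as

$$\mathbf{E}(r,\theta,\phi) = \frac{k}{\gamma^2 r^2\,(1 - \beta^2\sin^2\theta)^{3/2}}\;\mathbf{e}_r, \tag{11.67a}$$

$$c\mathbf{B}(r,\theta,\phi) = \frac{k\beta\sin\theta}{\gamma^2 r^2\,(1 - \beta^2\sin^2\theta)^{3/2}}\;\mathbf{e}_\phi. \tag{11.67b}$$

If we write the magnitude of the E field along the YZ-plane ($\theta = \pi/2$) as
$E_\perp$, and along the direction of motion ($\theta = 0$) as $E_\parallel$, then

$$E_\perp = \frac{k\gamma}{r^2}, \qquad E_\parallel = \frac{k}{\gamma^2 r^2}, \qquad \frac{E_\perp}{E_\parallel} = \gamma^3. \tag{11.68}$$

To get a clear picture of the angular distribution of the field, we shall
plot E on a “unit sphere”, whose radius $r_u$ we define as

$$r_u \equiv \sqrt{k} = \sqrt{\frac{q}{4\pi\varepsilon_0}}, \tag{11.69}$$

so that, from (11.67),

$$E(r_u, \theta, \phi) = \frac{1}{\gamma^2\,(1 - \beta^2\sin^2\theta)^{3/2}}.$$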


Figure 11.7 shows plots of this field surrounding the moving point
charge, on a unit sphere, corresponding to three values of $\beta$, namely $\beta \ll 1$,
$\beta$ = 0.8, $\beta$ = 0.95. The $E_\perp/E_\parallel$ ratios for the above three values of $\beta$ are
shown in Table 11.1. The length of the arrow in each figure is proportional
to the strength of the field. We have plotted the field exactly (i.e. the lengths
of the arrows exactly) using Gnuplot.


Table 11.1. $E_\perp/E_\parallel$ for 3 values of $\beta$.

$\beta$       $\gamma$       $E_\perp/E_\parallel$
≪ 1     ≈ 1     ≈ 1
0.8     1.66    4.63
0.95    3.2     32.84
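The entries of Table 11.1 follow directly from (11.67a) and (11.68). The sketch below (plain Python, with k = 1 and r = 1 chosen purely for convenience; the function names are illustrative assumptions) evaluates the field magnitude at $\theta = \pi/2$ and $\theta = 0$ and forms their ratio, which should reproduce $\gamma^3$.

```python
import math

def e_magnitude(r, theta, beta, k=1.0):
    """Magnitude of E from (11.67a) at distance r and polar angle theta."""
    gamma_sq = 1.0 / (1.0 - beta * beta)
    return k / (gamma_sq * r * r * (1.0 - (beta * math.sin(theta)) ** 2) ** 1.5)

def perp_to_parallel_ratio(beta):
    """E_perp / E_parallel at equal distance from the charge, cf. (11.68)."""
    return e_magnitude(1.0, math.pi / 2, beta) / e_magnitude(1.0, 0.0, beta)

for beta in (0.01, 0.8, 0.95):
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    # Columns: beta, gamma, E_perp/E_parallel, gamma**3 (the last two should agree).
    print(beta, round(gamma, 3), round(perp_to_parallel_ratio(beta), 3), round(gamma ** 3, 3))
```

The steep growth of the ratio with $\beta$ is the flattening of the field toward the plane transverse to the motion that Fig. 11.7 depicts.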


Time variation at any given observation point P 


The observation point is taken on the Y-axis, at x = 0, y = 1. Substituting
these values in (11.63a), and setting $\dfrac{q}{4\pi\varepsilon_0} = 1$, c = 1, we get the x- and
y-components of the E field (its z-component is zero):

$$E_x(t) = -\frac{\gamma\beta t}{[(\gamma\beta t)^2 + 1]^{3/2}}, \qquad E_y(t) = \frac{\gamma}{[(\gamma\beta t)^2 + 1]^{3/2}}. \tag{11.70}$$




Fig. 11.7. Electric field lines on unit sphere corresponding to $\beta \ll 1$, $\beta$ = 0.8, $\beta$ = 0.95.


Fig. 11.8. (a) and (b) Plots of $E_y$, $E_x$ as functions of time, corresponding to $\beta$ = 0.4, 0.6, 0.8. (c)
Ion A passing an atomic electron with velocity $c\beta$ in its passage through matter.


We have plotted these two fields as functions of t, and corresponding to $\beta$ =
0.4, 0.8, 0.95 in Figs. 11.8(a) and 11.8(b).


11.6.3. Ionizing effect of a heavy ion in its passage through 
matter 


The formula (11.63a), and its time plot shown in Figs. 11.8(a) and 11.8(b),
have important and interesting applications in nuclear physics. They are
used to calculate the ionizing effect of a “heavy ion”, e.g. a proton
(hydrogen ion) or an α-particle (helium ion), as it passes through matter with
high velocity, close to the speed of light, and the Range of such particles in
matter (i.e. distance travelled before being stopped).

We shall get an appreciation of the “unit radius” defined in (11.69) with
a realistic example, say a “heavy charged particle” A of positive electric
charge ze (e.g. a proton with z = 1 or an α-particle for which z = 2; the
corresponding “light particle” in this case is the electron). As the ion moves
through a piece of matter, e.g. gold foil or an aluminium strip, it ionizes the
atoms by pulling out electrons with its electric force. To make things
simple we take the ionizing particle to be a proton, for which q = e = 1.6 ×
10⁻¹⁹ C. Setting $\dfrac{1}{4\pi\varepsilon_0}$ = 9 × 10⁹ N·m²/C², we get $r_u$ = 3.8 × 10⁻⁵ m, or 38
microns. This “unit radius” is therefore quite large compared to the radius
of the ionized atoms. We have used it only for the convenience of plotting,
without attaching any further significance to it.

The phenomenon we are talking about is a collision of A with e (even
though they do not physically touch each other), depicted in Fig. 11.8(c).
The particle A of charge ze moves along the X-axis with velocity $v = c\beta$,
encountering an atomic electron of charge −e, located at P, at an impact
parameter b. Being heavy (infinitely heavy by assumption), it moves along
without being deflected from its straight path. As it zips past the atom it
exerts a short burst of electric force $F_e$ on all electrons in the atom, lasting
for a very short time Δt, compared to the time period of the electronic orbits
(pre-quantum classical picture). In this short burst, the movement of the
electron is so small that it can be treated as stationary. (Obviously, the
magnetic force $F_m$ will have practically no effect.) As a result, some of the
electrons in the atom will receive enough energy to be ejected out of the
atom, thereby ionizing it, while some others will be excited to higher energy
levels.

Let us get back to some calculations. It is seen from the plots in Fig.
11.8 that $E_x$ is antisymmetric, hence has no effect when integrated over the
t-axis. In contrast, $E_y$ is symmetric. When integrated over the t-axis, it will
give a net area under the curve, $\mathcal{A} = h\,\delta t$, where $h = \dfrac{k\gamma}{b^2}$ is the peak height
and $\delta t = 2t_{1/2} \approx \dfrac{2b}{\gamma v}$ is the width of the curve at “half height”, standing for
the effective “duration of the impact”. (Reader may verify the estimate.) Then
$\mathcal{A} = \dfrac{2k}{vb}$, with $k = \dfrac{ze}{4\pi\varepsilon_0}$. The passage of
the heavy particle gives a sharp impulse, i.e. a momentum transfer
$\delta p = e\mathcal{A} = \dfrac{2ze^2}{4\pi\varepsilon_0\,vb}$, in the negative Z-direction (net pulling effect of the
traversing particles). The corresponding energy transfer is

$$\Delta E = \frac{\delta p^2}{2m_e} = \frac{1}{(4\pi\varepsilon_0)^2}\,\frac{2(ze^2)^2}{m_e v^2 b^2}. \tag{11.71}$$

The above formula has important application in nuclear physics [36-38]. 
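The area under the transverse-field pulse can be checked by direct numerical integration. The sketch below (plain Python; the function names, the midpoint rule and the cut-off `s_max` are illustrative assumptions) integrates $E_y(t) = k\gamma b/[(\gamma vt)^2 + b^2]^{3/2}$, which is (11.63a) evaluated at impact parameter b, and compares the result with $2k/(vb)$. A second helper evaluates the energy transfer (11.71), with the argument `coulomb` standing for $1/(4\pi\varepsilon_0)$.

```python
import math

def pulse_area(k, b, beta, c=1.0, s_max=200.0, steps=40000):
    """Midpoint-rule time integral of E_y(t) = k*gamma*b / ((gamma*v*t)^2 + b^2)^(3/2)."""
    v = beta * c
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    t_max = s_max * b / (gamma * v)      # integrate far into the tails of the pulse
    dt = 2.0 * t_max / steps
    total = 0.0
    for i in range(steps):
        t = -t_max + (i + 0.5) * dt
        total += k * gamma * b / ((gamma * v * t) ** 2 + b * b) ** 1.5 * dt
    return total

def energy_transfer(z, e, m_e, v, b, coulomb=1.0):
    """Energy transfer of (11.71): (delta p)^2 / (2 m_e), delta p = 2 z e^2 coulomb / (v b)."""
    delta_p = 2.0 * coulomb * z * e * e / (v * b)
    return delta_p ** 2 / (2.0 * m_e)

for beta in (0.4, 0.8, 0.95):
    # The two printed numbers should agree closely for every beta.
    print(beta, pulse_area(1.0, 1.0, beta), 2.0 / beta)
```

The area does not depend on $\gamma$: a faster ion produces a taller but proportionally narrower pulse.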

A charged particle, like an α-ray, while passing through matter, keeps
losing its kinetic energy by transferring it to atomic electrons, thereby
liberating them, and producing electron-ion pairs, and ultimately getting
stopped after traversing an average distance R, called its range in the given
material. For example, the α-particle [38] emitted by the polonium
isotope Po²¹⁰, carrying energy 5.3 MeV, has a range of about 10 cm in dry
air. As it passes through the material, it keeps losing energy by producing
electron-ion pairs, which in this case is about 6600 pairs per mm.

We shall not pursue this matter further. For an exact calculation of these 
effects and an estimation of the Range of the particle, the interested reader 
can look up standard books in Nuclear Physics as cited in the footnote. 


11.7. Exercises for the Reader III 


Problem 11.1. The relative nature of the electric field E and the magnetic
field B can be seen in an elementary manner as follows. Let a particle of
electric charge q move along the X-axis with the velocity $c\beta$, as seen from a
Lab frame S, with respect to which there is a uniform magnetic field $\mathbf{B} = B\mathbf{j}$
and no E field at all. According to the Lorentz force equation (11.30), the
particle experiences a magnetic force $\mathbf{F} = qc\beta B\,\mathbf{k}$. With respect to the rest
frame S' of the particle, however, the particle is at rest, and hence there is
no magnetic force. The same magnetic force therefore now appears as an
electric force. Therefore, there is an electric field equal to $\mathbf{E} = c\beta B\,\mathbf{k}$, in the
Z-direction.

The answers suggested above are only approximate in view of the fact
that we did not apply the correct force transformation equation (8.55) to
obtain the force in S'. (a) Apply the suggested correction to obtain the E
field in S'. (b) Check your answer using the LT equation (11.37) for the
electromagnetic field under the boost $S(c\beta, 0, 0)S'$. (c) Does the absence of
“magnetic experience” by the particle in the rest frame imply vanishing B
field in S'? Check your answer using (11.37) again.


Problem 11.2. The conclusions of Problem 11.1 raise many paradoxes. A
“pure” magnetic field in S can be produced by a “neutral” wire carrying
electric current, as is usually the case for a d.c. circuit. How can such a wire
cause an electric field in another frame of reference?

To solve this paradox, consider a line current I flowing through a
straight wire along the X-direction in a Lorentz frame S. This current can be
considered to be due to a stream of negative charges (namely, the electrons)
of charge density $\rho_{neg} = -\rho_0$ with a drift velocity $-c\nu$ along the X-axis,
overlying a stationary distribution of positive charges (namely, positive ions
at the lattice points of the conductor) of equal and opposite charge density
$\rho_{pos} = \rho_0$. Assuming that the wire has a circular cross-section of radius a,
the electric current density and the charge density $\rho$ for this configuration
are given as follows:

$$J_x = \begin{cases}\rho_0 c\nu, & r < a\\ 0, & r > a\end{cases};\qquad J_y = J_z = 0;\qquad \rho = \rho_{neg} + \rho_{pos} = 0. \tag{11.72}$$

This configuration produces only B field, but no E field.

(a) Find the B field at the point P(0, y, 0) on the Y-axis produced by the
above current.

Hint: The (E, B) fields at P due to a line charge density $\lambda$ and a line current
I, both along the X-axis, are given as

$$\mathbf{E} = \frac{\lambda}{2\pi\varepsilon_0\,y}\,\mathbf{j}, \qquad c\mathbf{B} = \frac{I}{2\pi\varepsilon_0\,c\,y}\,\mathbf{k}. \tag{11.73}$$


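(b) Using (11.37) show that the EM field at P, as seen by an observer S' who
is moving along the X-axis with velocity $c\beta$, is

$$\mathbf{E}' = -\frac{\gamma\nu\beta\,\rho_0 a^2}{2\varepsilon_0\,y}\,\mathbf{j}, \qquad c\mathbf{B}' = \frac{\gamma\nu\,\rho_0 a^2}{2\varepsilon_0\,y}\,\mathbf{k}. \tag{11.74}$$

(c) Using LT of the 4-current density shown in (11.72) show that the electric
current density and charge density with respect to S' are

$$J'_x = \begin{cases}\gamma\rho_0 c\nu, & r \le a\\ 0, & r > a\end{cases};\qquad J'_y = J'_z = 0;\qquad \rho' = \rho'_{neg} + \rho'_{pos} = -\gamma\nu\beta\rho_0. \tag{11.75}$$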


(d) With the help of (11.73), and using the current and charge densities 
shown in (11.75) obtain the (E, B) fields at P as measured by S'. Verify that 
you get the same answer as in (11.74). 


Moral: A wire which is neutral with respect to a Lorentz frame S is not 
neutral with respect to another Lorentz frame S', which is why there is an E 
field in S', in addition to the B field due to the electric current. 


Problem 11.3. We saw in the last two problems that a pure magnetic field
in one Lorentz frame will appear as a combination of magnetic field and
electric field in another. We shall examine the reverse of this case in this
problem. Consider a frame S in which the field is purely E, there being no
B-component. The force on a charge q moving with velocity $\mathbf{v} = c\boldsymbol{\nu}$ is
therefore $\mathbf{F} = q\mathbf{E}$.

(a) Using the force and velocity transformation formulas (8.55) and (4.23)
(and their inverses if necessary), show that the force components in the
frame S', under the boost $S(c\beta, 0, 0)S'$, become

$$\begin{aligned}
F'_x &= q\,[E_x - \beta\gamma\,(\nu'_y E_y + \nu'_z E_z)],\\
F'_y &= q\gamma\,(1 + \beta\nu'_x)\,E_y,\\
F'_z &= q\gamma\,(1 + \beta\nu'_x)\,E_z.
\end{aligned} \tag{11.76}$$

(b) Hence, show with the help of (11.30) that there are both E and B fields
in S', given as

$$\begin{aligned}
E'_x &= E_x, & E'_y &= \gamma E_y, & E'_z &= \gamma E_z,\\
cB'_x &= 0, & cB'_y &= \gamma\beta E_z, & cB'_z &= -\gamma\beta E_y.
\end{aligned} \tag{11.77}$$

(c) Check Eq. (11.77) against the general field transformation equation
(11.37).


“We imagine the fluid particles to be moving together without random thermal motion. 


Part IV 


4-Momentum Conservation in 
Continuous Media 


Chapter 12 


The Energy Tensor 


12.1. Why Energy Tensor? 


Forces of gravity originate from massive objects like the earth, the Sun, the 
stars — as Newton’s theory of Universal Gravitation tells us. Mass loses the 
pristine purity enjoyed in Newtonian mechanics, because the relativistic 
mass is no longer constant (being a function of velocity), and the rest mass 
is no longer conserved. What replaces mass is energy, thanks to $\mathcal{E} = mc^2$.
Energy, in turn, is the time component of a larger entity, namely 4- 
momentum (or, En-Mentum, a term coined in Sec. 8.4). 

For an analytical study of gravitational field in Newtonian formalism, 
its source is expressed in terms of mass density. That density now needs to 
be replaced by some 4-momentum density. However, density being 
something per unit volume — and a unit volume in one frame being not so 
unit in another — the search for an appropriate density leads towards a 4- 
tensor which is generally known as energy—momentum-stress tensor. We 
would however prefer to give it a shorter name — energy tensor. Einstein’s 
General Theory of Relativity traces the source of gravitation to this energy 
tensor. There is another reason for knowing the energy tensor. Even within 
the ambit of special relativity, it is a legitimate urge to write the 
conservation equations for energy and momentum in the appropriate 
language, namely covariantly. One would stumble upon the energy tensor in 
this attempt to write covariant conservation equations, which is also a 
prerequisite to the study of relativistic quantum mechanics and quantum 
field theory. The concept of energy tensor is built upon a more elementary 
three-dimensional base namely, the stress tensor. We had presented a 
detailed exposition of stress tensor in Sec. 5.2, and of Maxwell’s stress 


tensor in particular in Chapter 6. The volume force density $\mathbf{f}_v(\mathbf{r})$ in matter
(solid or fluid) at a point r was shown to be the divergence of the stress
tensor field $\mathsf{T}(\mathbf{r})$, in Eq. (5.79), which we are rewriting here:

$$\mathbf{f}_v(\mathbf{r}) = \nabla\cdot\mathsf{T}. \tag{12.1}$$


Fig. 12.1 Lab frame S and comoving frame $S_0$ in a streamline flow of particles.


12.2. Minkowski Volume Force Density 


Passing from Euclidean space to Minkowski space-time, one might expect 
the four-dimensional generalization of stress tensor to be a 4-tensor field 
whose 4-divergence would yield Minkowski volume force density. 
However, density functions do not exhibit well-defined transformation 
property unless volume is measured in the comoving frame. We had touched 
on this aspect in Sec. 11.3. We transplant Fig. 11.4 from that section here 
and relabel it as Fig. 12.1. It shows a stream of particles constituting a fluid 
in motion. An infinitesimal volume $\delta V$ (shown coloured in the figure),
identified at the event point (x) = (ct, r), contains a collection of fluid
particles, which together possess a rest mass $\delta m_0$, a quantity of charge $\delta q$,
and is moving with the velocity $\mathbf{u}(\mathbf{r}, t)$ with respect to the Lab frame S. The
corresponding 4-velocity $U^\mu$ and the Lorentz factor $\Gamma(x)$, copied from Eqs.
(11.24) and (11.20), are

$$U^\mu(x) = \Gamma(x)\,(c, \mathbf{u}(x)) = \Gamma\,(c, u_1, u_2, u_3), \tag{12.2a}$$

$$\Gamma(x) = \frac{1}{\sqrt{1 - u^2(x)/c^2}}. \tag{12.2b}$$

Lorentz contraction of the dimension of this box along the direction of
$\mathbf{u}$ changes its proper volume $\delta V_0$ to the laboratory volume

$$\delta V = \frac{\delta V_0}{\Gamma(x)}, \quad \text{or,} \quad \delta V_0 = \Gamma(x)\,\delta V. \tag{12.3}$$


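Let the Minkowski force acting on these particles (inside the proper
volume $\delta V_0$) be $\delta\vec{\mathcal{F}}(x) = \hat{e}_\mu\,\delta\mathcal{F}^\mu(x)$, the corresponding 3-force $\delta\mathbf{F}(x)$, and the
power received $\delta\Pi(x)$. Then from (8.31)

$$\begin{aligned}
\delta\mathcal{F}^\mu(x) &= \Gamma(x)\left(\frac{\delta\Pi(x)}{c},\ \delta\mathbf{F}(x)\right)\\
&= \Gamma(x)\,\delta V\left(\frac{1}{c}\frac{\delta\Pi(x)}{\delta V},\ \frac{\delta\mathbf{F}(x)}{\delta V}\right)\\
&= \delta V_0\left(\frac{1}{c}\frac{\delta\Pi(x)}{\delta V},\ \frac{\delta\mathbf{F}(x)}{\delta V}\right).
\end{aligned} \tag{12.4}$$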


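We define the Minkowski volume 4-force density $\mathfrak{f}^\mu(x)$ to be the Minkowski
force per unit proper volume, the 3-scalar density $\varpi$ (to be pronounced as
var-pi) as the power received per unit lab volume, or power density, and the
3-vector density $\mathbf{f}(x)$ as the 3-force per unit lab volume, as explained below:

$$\mathfrak{f}^\mu(x) = \lim_{\delta V_0\to 0}\frac{\delta\mathcal{F}^\mu(x)}{\delta V_0}, \tag{12.5a}$$

$$\varpi(x) = \lim_{\delta V\to 0}\frac{\delta\Pi(x)}{\delta V}, \tag{12.5b}$$

$$\mathbf{f}(x) = \lim_{\delta V\to 0}\frac{\delta\mathbf{F}(x)}{\delta V}. \tag{12.5c}$$

It follows from (12.4) and (12.5) that

$$\delta\mathcal{F}^\mu(x) = \delta V_0\;\mathfrak{f}^\mu(x), \tag{12.6a}$$

$$\text{where}\quad \mathfrak{f}^\mu(x) = \left(\frac{\varpi(x)}{c},\ \mathbf{f}(x)\right). \tag{12.6b}$$

An example of Minkowski volume 4-force density $\mathfrak{f}^\mu(x)$ and its time and
space components can be seen in Eq. (12.20a).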

At this point, we shall remind the reader of the convention we adopted
in Sec. 8.1 and onwards. A 4-vector $\vec{A}$ can also be written in terms of its four
components $\{A^\mu;\ (\mu = 0, 1, 2, 3)\}$ as $\vec{A} = \vec{e}_\mu A^\mu$. A 4-tensor can also be
written in terms of its 4 x 4 components $\{T^{\mu\nu};\ (\mu, \nu = 0, 1, 2, 3)\}$ as
$\mathsf{T} = \vec{e}_\mu T^{\mu\nu} \vec{e}_\nu$. Sometimes we shall refer to a 4-vector or a 4-tensor as a whole
“geometrical object”, like $\vec{A}$ and $\mathsf{T}$. Sometimes we shall write the same
quantities in terms of their components, as $A^\mu$ and $T^{\mu\nu}$.

The 4-stress tensor is now defined to be a symmetric tensor
$\mathsf{T}(x) = \vec{e}_\mu T^{\mu\nu}(x)\,\vec{e}_\nu$ of rank 2, satisfying the requirement:


= 
tw 
~= 


We may as well call it Minkowski 4-stress tensor. 

Note from (7.113) that the time and space components of the operator are
given by $\nabla_\alpha = \left(\frac{1}{c}\frac{\partial}{\partial t},\ \nabla\right)$.

In the next few sections, we shall work towards such a tensor for a 
stream of incoherent dust, consisting of electrically charged particles, 
subjected to only electromagnetic forces. 


12.3. Energy and Momentum Conservation in One Voice 


Before proceeding further we shall apply the Lorentz force (11.30) to a volume
distribution of electric charges. Write the Lorentz force $\delta\mathbf{F}$ on the charge
content $\delta q$ inside an infinitesimal volume $\delta V$ by replacing $q$ with $\rho\,\delta V$,
where $\rho$ is the charge density as defined in Sec. 11.3. Now divide each side
by $\delta V$, and use the definition of the volume current density $\mathbf{J}$ given in
(11.19). The Lorentz force density at the event point (x) is now

$$ \mathbf{f}_{\rm em} = \lim_{\delta V \to 0} \frac{\delta\mathbf{F}}{\delta V} = \rho\mathbf{E} + \mathbf{J} \times \mathbf{B}. \tag{12.8} $$


Two important results are used to state the conservation laws involving
the electromagnetic forces. We shall state them as theorems, because they
follow directly from Maxwell’s equations.

Consider the same stream of charged particles subjected to 
electromagnetic forces of their own creation. We shall apply energy and 
momentum conservation theorems to this system of particles. 


(A) The energy theorem, also called Poynting’s theorem^a, is written as
follows:

$$ \mathbf{E} \cdot \mathbf{J} + \frac{\partial w}{\partial t} = -\nabla \cdot \mathbf{S}. \tag{12.9} $$


We have proved the above theorem in Appendix A.1. 
We interpret the terms appearing in the above equation as follows: 
$\mathbf{E} \cdot \mathbf{J}$ = work done by the field on the fluid particles per unit volume per unit time,
 = rate of change of kinetic energy per unit volume, (12.10a)

$w = \frac{\epsilon_0}{2}\left(E^2 + c^2 B^2\right)$ = field energy density, (12.10b)

$\mathbf{S} = \epsilon_0 c^2\,(\mathbf{E} \times \mathbf{B})$ = field energy flux density = Poynting’s vector. (12.10c)


All densities alluded to in the context of energy-momentum theorems 
(12.9) and (12.13) are lab densities. See comments after Eq. (12.4). 

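To justify the above interpretation we integrate over a volume V
bounded by a surface S and, applying Gauss’s theorem, get

$$ \iiint_V \left(\mathbf{E} \cdot \mathbf{J} + \frac{\partial w}{\partial t}\right) dV = -\iiint_V \nabla \cdot \mathbf{S}\, dV = -\oint_S \mathbf{S} \cdot d\mathbf{a}. \tag{12.11} $$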


LHS = rate of change of [mch energy + fld energy] inside V. 

RHS = - outflux of fld energy across S = influx of fld energy across S. 
Therefore, 

rate of change of [mch energy + fld energy] per unit volume = influx 
density of fld energy per unit volume. 

Our interpretation is justified. 


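(B) The momentum theorem:
We proved the following theorem, which follows from Maxwell’s
equations, as Eq. (6.50c):

$$ (\rho\mathbf{E} + \mathbf{J} \times \mathbf{B}) + \frac{\partial}{\partial t}\left(\epsilon_0\,\mathbf{E} \times \mathbf{B}\right) = \nabla \cdot \mathsf{T}_{(\rm em)}. \tag{12.12} $$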


We shall interpret the two terms on the LHS as follows. The (E, B) field
exerts a force on the existing charge-current distribution according to the
Lorentz force equation. The first term represents this force $\mathbf{f}_{\rm em}$, as in Eq.
(12.8), equal to the rate of change of the momentum of the particles per unit
volume. This momentum per unit volume is represented by $\mathbf{P}$, which we
shall refer to as the mechanical momentum density.

However, when these fields start changing with time they create a
propagating em field which carries away energy and momentum. The
second term should represent the rate of change of this field momentum
density, to be represented by the symbol $\mathbf{g}$:

$$ \mathbf{g} = \epsilon_0\,(\mathbf{E} \times \mathbf{B}) = \frac{\mathbf{S}}{c^2}. \tag{12.13} $$


Now we can rewrite Eq. (12.12) as

$$ \frac{\partial \mathbf{P}}{\partial t} + \frac{\partial \mathbf{g}}{\partial t} = \nabla \cdot \mathsf{T}_{(\rm em)}. \tag{12.14} $$


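To justify the above interpretation, we shall integrate (12.14) over a
volume V bounded by a surface S and, applying Gauss’s theorem, get:

$$ \frac{d}{dt}\left(\iiint_V \mathbf{P}\, dV\right) + \frac{d}{dt}\left(\iiint_V \mathbf{g}\, dV\right)
 = \iiint_V \left(\nabla \cdot \mathsf{T}_{(\rm em)}\right) dV
 = \oint_S \mathsf{T}_{(\rm em)} \cdot \mathbf{n}(\mathbf{r})\, da. \tag{12.15} $$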


LHS = The rate of change of [Mch momentum + Fld momentum] inside V. 
RHS = Total em force transmitted across S = Influx of fld momentum 
across S. 

Hence, we interpret Eq. (12.15) as saying that 

Rate of change of [Mch momentum + Fld momentum] per unit volume = 
Influx density of Fld momentum per unit volume. 

Our interpretation is justified. 


Equations (12.9) and (12.14) are two equations expressing conservation 
of energy and momentum, separately. The spirit of relativity will demand 
that they should be integrated into a single equation, unifying conservation 
of energy and momentum as a conservation of En-Mentum (a name for 4- 
momentum coined in Sec. 8.4). As a first step towards this we rewrite Eqs. 
(12.9) and (12.14) in such a way that the left-hand side will represent the 
charged particles and the right-hand side the em field: 


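$$ \mathbf{E} \cdot \mathbf{J} = -\left[\frac{\partial w}{\partial t} + \nabla \cdot \mathbf{S}\right], \tag{12.16a} $$

$$ \rho\mathbf{E} + \mathbf{J} \times \mathbf{B} = -\frac{\partial \mathbf{g}}{\partial t} + \nabla \cdot \mathsf{T}_{(\rm em)} \tag{12.16b} $$

$$ \qquad = -\left[\frac{\partial \mathbf{g}}{\partial t} + \nabla \cdot \mathsf{\Phi}_{(\rm em)}\right]. \tag{12.16c} $$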


In the last equation, $\mathsf{\Phi}_{(\rm em)}$ is the momentum “outflux density”, equal and
opposite to the momentum “influx density” $\mathsf{T}_{(\rm em)}$. See Eq. (6.53).

Equations (12.16a) and (12.16c) represent the time component and the
space components of one 4-vector equality.


The right side terms can be combined into a 4-vector, which we shall
define to be the negative 4-divergence of a 4-tensor $\mathsf{M}$, namely
Maxwell’s energy 4-tensor. This tensor is an upgrade of Maxwell’s
stress tensor $\mathsf{T}_{(\rm em)}$ defined in Eq. (6.46), except that $(-\mathsf{T}_{(\rm em)})$, defined as $\mathsf{\Phi}_{(\rm em)}$, forms
the 3 x 3 core of this upgrade. The 4 x 4 components of this tensor will
be written as $M^{\mu\nu}$. The time and space components of the new 4-vector are
as follows:

Time component:
$$ \frac{1}{c}\left(\frac{\partial w}{\partial t} + \nabla \cdot \mathbf{S}\right) \stackrel{\rm def}{=} \nabla_\alpha M^{0\alpha}. \tag{12.17a} $$

Space component:
$$ \left[\frac{\partial \mathbf{g}}{\partial t} + \nabla \cdot \mathsf{\Phi}_{(\rm em)}\right]_k \stackrel{\rm def}{=} \nabla_\alpha M^{k\alpha}; \quad k = 1, 2, 3. \tag{12.17b} $$


The subscript k on the left-hand side implies the x, y, z components of the
vector, corresponding to k = 1, 2, 3, respectively.
It is now easy to identify the 16 components of Maxwell’s energy 4-
tensor $\mathsf{M}$ by taking a close look at Eq. (12.17), and recalling the components
of the operator $\nabla_\alpha$ shown in (7.113). Equation (12.17a) yields the
components of the column 0, and Eq. (12.17b) the components of the
columns k = 1, 2, 3. Remember that $c\,\mathbf{g} = \mathbf{S}/c$, according to (12.13). For
further help, see Appendix A.3. With rows and columns labelled 0, 1, 2, 3,

$$ M^{\mu\nu}(x) = \begin{pmatrix}
 w & S_x/c & S_y/c & S_z/c \\
 S_x/c & \Phi^{11} & \Phi^{12} & \Phi^{13} \\
 S_y/c & \Phi^{21} & \Phi^{22} & \Phi^{23} \\
 S_z/c & \Phi^{31} & \Phi^{32} & \Phi^{33}
\end{pmatrix}. \tag{12.18} $$

We have written $\Phi^{11}, \Phi^{12}, \ldots$ to mean $\Phi^{xx}_{(\rm em)}, \Phi^{xy}_{(\rm em)}, \ldots$, respectively. Note that $M^{\mu\nu}$
is symmetric and traceless:

$$ M^{\mu\nu} = M^{\nu\mu}, \qquad M^{\mu}{}_{\mu} = 0. \tag{12.19} $$


Both properties are Lorentz invariant, i.e. same in all inertial frames. 


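The left side terms of (12.16) can be combined into another 4-vector,
namely $\vec{f}_{\rm fld \to mat}$, the Minkowski force per unit proper volume from the em
field on the charged particles in the fluid, thereby changing the En-Mentum
of the particles of the charged fluid:

$$ \vec{f}_{\rm fld \to mat}(x) = \left(\frac{1}{c}\,\mathbf{E} \cdot \mathbf{J},\ \rho\mathbf{E} + \mathbf{J} \times \mathbf{B}\right) \tag{12.20a} $$

$$ \qquad = \vec{e}_\mu\, \frac{1}{c} F^{\mu\alpha} J_\alpha, \tag{12.20b} $$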


where $F^{\mu\alpha}$ and $J_\alpha$ are, respectively, the electromagnetic field 4-tensor and
the electric current density 4-vector, both of them defined in Chapter 11, as
Eqs. (11.36) and (11.25) respectively.

Equations (12.20a) and (12.20b) are analogous to Eqs. (11.32) and
(11.34) of Chapter 11. In fact (12.20b) follows from (12.20a) in the same way as
Eq. (11.25) follows from Eq. (11.32).
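
The equality of (12.20a) and (12.20b) can be verified numerically for sample values. The sketch below is hedged: it assumes the component convention $F^{0i} = -E_i$, $F^{ij} = -c\,\epsilon_{ijk}B_k$ and $J_\alpha = (c\rho, -\mathbf{J})$, chosen so that the factor 1/c in (12.20b) reproduces (12.20a). The actual layout of Eq. (11.36) is not reproduced here and may differ. All names and values are illustrative.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def field_tensor(E, B):
    """ASSUMED contravariant F^{mu alpha}: F^{0i} = -E_i, F^{ij} = -c eps_ijk B_k."""
    Ex, Ey, Ez = E
    cBx, cBy, cBz = (C * b for b in B)
    return [[0.0, -Ex, -Ey, -Ez],
            [Ex, 0.0, -cBz, cBy],
            [Ey, cBz, 0.0, -cBx],
            [Ez, -cBy, cBx, 0.0]]


def force_density_covariant(E, B, rho, J):
    """(1/c) F^{mu alpha} J_alpha, the right-hand side of Eq. (12.20b)."""
    F = field_tensor(E, B)
    J_lower = (C * rho, -J[0], -J[1], -J[2])  # covariant components J_alpha
    return tuple(sum(F[m][a] * J_lower[a] for a in range(4)) / C for m in range(4))


def force_density_direct(E, B, rho, J):
    """((1/c) E.J, rho E + J x B), the right-hand side of Eq. (12.20a)."""
    JxB = cross(J, B)
    power = sum(e * j for e, j in zip(E, J))
    return (power / C,) + tuple(rho * E[i] + JxB[i] for i in range(3))


E, B = (2.0, -1.0, 0.5), (1.0e-3, 3.0e-3, -2.0e-3)  # illustrative fields
rho, J = 1.0e-6, (10.0, -20.0, 5.0)                 # illustrative sources
print(force_density_covariant(E, B, rho, J))
print(force_density_direct(E, B, rho, J))
```

The two printed 4-tuples agree component by component, the first entry being the power density divided by c and the last three the Lorentz force density (12.8).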

The conservation equations for 4-momentum, appearing disjointedly as
(12.9) and (12.14), will now join into the following single 4-equation^b:

$$ \frac{1}{c} F^{\mu\alpha} J_\alpha = -\nabla_\beta M^{\mu\beta}(x). \tag{12.21} $$


12.4. Euler’s (Non-relativistic) Equation of Motion for a 
Perfect Fluid 


Our objective now is to construct the energy tensor of the simplest “closed 
system”. The term “closed” in this context means that the system is self- 
contained in all its dynamical behaviour, i.e., all dynamical processes take 
place due to forces of interaction within the system, there being no scope 
for exchange of energy and momentum with anything outside. The total 
energy and the total momentum of a closed system are therefore fully 
conserved. 

A closed system contains both matter and forces. The only kind of 
classical forces that can receive relativistic treatment are electromagnetic 
forces. Before linking up matter with electromagnetic forces, we shall 
consider an oversimplified model which consists of matter in the form of 
perfect fluid — sometimes also called “classical fluid” — moving under the 


influence of internal and external forces whose origin we need not specify 
at this moment. We shall first lend a non-relativistic treatment to this fluid, 
so that transition to a relativistic formalism becomes smooth in the next 
section. The equation of motion of this perfect fluid is known as Euler’s 
equation. 

By perfect fluid, we mean a fluid which does not offer any viscous
forces, which, as the reader knows, cause shear stresses in the fluid. A
perfect fluid, whether at rest or in motion, can sustain only normal
compressive stresses inside, familiarly known as “pressure”.

Let us consider a fluid in streamline motion as previously illustrated in
Fig. 12.1. This fluid is characterized by a velocity field $\mathbf{u}(\mathbf{r}, t)$ and a fluid
mass density $\sigma(\mathbf{r}, t)$, both of which, in general, are unsteady fields, i.e.
functions of t as well. The divergence of $\mathbf{u}$ is called dilatation, a term we
shall explain with the help of Figs. 12.2(a) and 12.2(b).


Fig. 12.2 Explaining fluid motion. 


We have shown a stream of fluid in motion, inside of which we have 
marked out a volume V at time t. Since the fluid particles on the surface S of 
V have different velocities u(r, t), the boundary S not only moves with the 
particles lying on it, but also changes to a different shape S' (shown with 


broken line) at the time t + dt. Consequently, V will also change to a 
different volume, say V’. 

Consider a film of fluid particles lying over a tiny area $\delta a$ centred at the
point P. These particles move a tiny distance $\mathbf{u}\,dt$ from P to P’ in time dt. In
this time, a volume of fluid $\delta v$ flows out from V, crossing the tiny surface
area $\delta a$. The volume that flows out is $\delta v = [\mathbf{u} \cdot \mathbf{n}]\,\delta a\,dt$.

There are certain regions of S, say at P, where $\mathbf{u} \cdot \mathbf{n}$ is positive, and the
outflux (i.e. volume outflow) is positive. There are some other regions, say,
at Q, where $\mathbf{u} \cdot \mathbf{n}$ is negative, and the outflux is negative. The net outflux of
fluid volume in the time dt is the surface integral of $\mathbf{u} \cdot \mathbf{n}$ over the boundary
surface S, multiplied by dt. This can be written as

$$ dV = V' - V = \left[\oint_S (\mathbf{u} \cdot \mathbf{n})\, da\right] dt = \left[\iiint_V (\nabla \cdot \mathbf{u})\, d^3r\right] dt, \tag{12.22} $$

where we have used Gauss’s theorem to convert the surface integral to a
volume integral. We reduce the finite volume V to an infinitesimal volume
$\delta V$, thereby avoiding integration, and get

$$ d(\delta V) = [(\nabla \cdot \mathbf{u})\,\delta V]\, dt. \tag{12.23} $$
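Therefore^c

$$ \nabla \cdot \mathbf{u} = \frac{1}{\delta V}\,\frac{d(\delta V)}{dt}. \tag{12.24} $$

In other words, $\nabla \cdot \mathbf{u}$ is the rate of change of volume per unit volume or,
more compactly, dilatation.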

Now we take up the equation of motion proper. Consider a fluid element
consisting of an infinitesimal collection of fluid particles moving along the
stream (Fig. 12.2(c)). At the time t its centre of mass is located at P, where it
occupies a volume $\delta V$. The mass of this element is $\delta m = \sigma(\mathbf{r}, t)\,\delta V$, its
momentum $\delta\mathbf{p} = \delta m\,\mathbf{u}(\mathbf{r}, t)$ and the force impressed on it $\delta\mathbf{F} = \mathbf{f}(\mathbf{r}, t)\,\delta V$,
where $\mathbf{f}(\mathbf{r}, t)$ represents the volume force density. Applying Newton’s second
law of motion to this fluid element,

$$ \frac{d}{dt}(\delta\mathbf{p}) = \delta\mathbf{F}, \tag{12.25a} $$

$$ \text{or,}\quad \frac{d}{dt}\left[\delta m\,\mathbf{u}(\mathbf{r}, t)\right] = \mathbf{f}(\mathbf{r}, t)\,\delta V. \tag{12.25b} $$


Note that in the above equation $\frac{d}{dt}$ represents the convective derivative,
whose meaning we shall explain with a more general example. Let there
exist a certain field f(x, y, z, t) in the fluid (e.g., temperature, fluid velocity,
pressure). The value of this field at the location of the particle changes from
f(x, y, z, t) to f(x + dx, y + dy, z + dz, t + dt) as the particle moves with
velocity $\mathbf{u}$ from the location r = (x, y, z) at the time t to take up a new
location r + dr = (x + dx, y + dy, z + dz) at the time t + dt. The net change is

$$ df = \left(\frac{df}{dt}\right)_c dt. \tag{12.26} $$

We have attached a subscript “c” to stress that the time rates of change
of physical quantities in motion are given by their convective derivatives:

$$ \frac{df}{dt} \equiv \left(\frac{df}{dt}\right)_c = \left(\mathbf{u} \cdot \nabla + \frac{\partial}{\partial t}\right) f(x, y, z, t). \tag{12.27} $$
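
The meaning of the convective derivative can be illustrated numerically in one space dimension. In the sketch below we follow a particle moving with constant velocity u through a sample field and compare the rate of change seen by the particle (a central finite difference along its path) with the right-hand side of (12.27). The field, the velocity and the step size are hypothetical choices made only for this illustration.

```python
import math


def f(x, t):
    """Illustrative scalar field (hypothetical): a decaying sinusoid."""
    return math.sin(x) * math.exp(-0.5 * t)


def partial_t(x, t):
    """Analytic partial derivative of f with respect to t."""
    return -0.5 * math.sin(x) * math.exp(-0.5 * t)


def partial_x(x, t):
    """Analytic partial derivative of f with respect to x."""
    return math.cos(x) * math.exp(-0.5 * t)


def convective_derivative(x, t, u):
    """(u d/dx + d/dt) f, the one-dimensional form of Eq. (12.27)."""
    return u * partial_x(x, t) + partial_t(x, t)


def rate_along_path(x, t, u, dt=1e-6):
    """Central difference of f sampled at the moving particle's position."""
    return (f(x + u * dt, t + dt) - f(x - u * dt, t - dt)) / (2.0 * dt)


x, t, u = 0.7, 1.3, 2.0
print(convective_derivative(x, t, u), rate_along_path(x, t, u))
```

The two numbers agree to within the finite-difference error, whereas the partial derivative alone would miss the contribution $u\,\partial f/\partial x$ arising from the motion of the particle.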


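Using Eqs. (12.24) and (12.27), we establish a few relations for future
reference:

$$ \frac{d}{dt}[\sigma\,\delta V] = \frac{d\sigma}{dt}\,\delta V + \sigma\,\frac{d(\delta V)}{dt}
 = \left[\left\{\frac{\partial \sigma}{\partial t} + (\mathbf{u} \cdot \nabla)\sigma\right\} + \sigma\,\{\nabla \cdot \mathbf{u}\}\right] \delta V \tag{12.28a} $$

$$ \qquad = \left[\frac{\partial \sigma}{\partial t} + \nabla \cdot (\sigma\mathbf{u})\right] \delta V, \tag{12.28b} $$

where $\sigma$ represents any density function, of which the mass density is a
particular example.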

In non-relativistic (N.R.) physics mass is conserved. Consider the two terms in the first
equality in (12.28a). The first term, if positive, means an increase in mass in
$\delta V$ due to density fluctuation. The second term, if positive, means an increase
in mass due to volume fluctuation. However, both of them cannot be
positive. An increase in one term is nullified by a decrease in the other. Together
they represent zero change. We get back the mass conservation equation,
known as the continuity equation:

$$ \frac{d}{dt}[\sigma\,\delta V] = 0 \ \Longrightarrow\ \frac{\partial \sigma}{\partial t} + \nabla \cdot (\sigma\mathbf{u}) = 0. \tag{12.29} $$


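We shall convert Eq. (12.28) to a momentum equation. Replace the
scalar density $\sigma$ with the density of the x-component of momentum, $\sigma u_x$, in
the above equation, and get

$$ \frac{d}{dt}\left[(\sigma u_x)\,\delta V\right] = \left[\frac{\partial (\sigma u_x)}{\partial t} + \nabla \cdot (\sigma u_x \mathbf{u})\right] \delta V. \tag{12.30} $$

The above relation holds for all the three components $u_x, u_y, u_z$. Multiplying
the components with $\mathbf{e}_x, \mathbf{e}_y, \mathbf{e}_z$ and adding them together, we get

$$ \frac{d}{dt}\left[\sigma\mathbf{u}\,\delta V\right] = \left[\frac{\partial}{\partial t}(\sigma\mathbf{u}) + \nabla \cdot (\sigma\mathbf{u}\mathbf{u})\right] \delta V. \tag{12.31} $$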


Going back to Eq. (12.25), noting that $\delta m\,\mathbf{u}(\mathbf{r}, t) = \sigma\mathbf{u}\,\delta V$ and using (12.31),
we get the general equation of motion for the fluid:

$$ \frac{\partial}{\partial t}(\sigma\mathbf{u}) + \nabla \cdot (\sigma\mathbf{u}\mathbf{u}) = \mathbf{f}. \tag{12.32} $$

Equation (12.32) is the general equation of motion of a fluid, to be referred
to as Euler’s equation.


12.5. Relativistic Equation of Motion for a Continuous 
Incoherent Media 


We shall upgrade the $E^3$ version of the fluid equation of motion (12.32) to
$M^4$. The starting point of the former was (12.25a). The starting point of the
latter will be the $M^4$ version of this equation, i.e. Eq. (8.28), in which we set
$\vec{P} \to \delta\vec{P}$ and $\vec{\mathcal{F}} \to \delta\vec{\mathcal{F}}$:

$$ \frac{d(\delta\vec{P})}{d\tau} = \delta\vec{\mathcal{F}}. \tag{12.33} $$


Here $\delta\vec{P}$ is the 4-momentum of the mass content of the same fluid volume
$\delta V$ considered in Sec. 12.4, and $\delta\vec{\mathcal{F}}$ is the Minkowski force on this volume.
We shall write the left and the right side of the above equation 


= ; d(oP ) d( õp”) : 
Then the EoM (12.33) can be written in the form:

$$ \frac{d\,\delta p^\mu}{d\tau} = f^\mu(x)\,\delta V_0, \tag{12.35a} $$

$$ \text{or}\quad \frac{d\,\delta p^\mu}{dt} = f^\mu(x)\,\delta V. \tag{12.35b} $$


To go from (12.35a) to (12.35b), we divided each side by $\Gamma$, and recalled
Eqs. (4.16) and (12.3).

If we make use of the expression for $f^\mu$ given in (12.6), the EoM takes
the form:


(12.36) 


Equation (12.36) will be the backbone of our arguments. It states that 
the rate of change of 4-momentum of an infinitesimal volume óV of 
particles at the event point (x) is equal to the total 4-force acting on this 
volume. We shall convert it to a beautiful form in Eq. (12.44).


At this point let us be aware that mass is not conserved in relativistic 
mechanics. Mass conservation is violated, even if infinitesimally, in all real 
situations. Mass of a system changes when chemical reactions take place, 
when atoms absorb or emit light, when a gas expands or is compressed. 
Even for the perfect fluid, whose dynamics was given a relatively simple 
non-relativistic treatment in Sec. 12.4, its mass is continuously changing 
because of the work being done by fluid pressure. This effect has to be 
taken into consideration. 

To make our task manageable, we shall think of a system of particles 
forming a tenuous fluid in motion. The constituent particles move along 
streamlines without a bond between them, i.e. no collision occurs. Also, the 
constituent particles — atoms, molecules, nuclei, electrons — whatever 
they may be, remain in their original ground states through the dynamical 
processes, and, hence, do not emit or absorb light, so that their rest masses 
do not change. The particles are charged, and the electromagnetic field
created by their charges determines their equation of motion.

Let Fig. 12.1 represent a segment of this flowing fluid. An infinitesimal
volume $\delta V$ of this fluid, at the event point (x), possesses a rest mass $\delta m_0$,
which is the sum of the rest masses of all the constituent particles inside $\delta V$.
That is, $\delta m_0 = \sum_{i=1}^{\delta N} m_{0i}$, where $\delta N$ is the number of particles inside $\delta V$ and
$m_{0i}$ is the rest mass of the ith particle in this infinitesimal collection. Let $\sigma_0$
stand for the proper density of rest mass, which we define as

$$ \sigma_0(x) = \lim_{\delta V_0 \to 0} \frac{\delta m_0}{\delta V_0}, \tag{12.37} $$

where $\delta V_0$ is the proper volume of the above collection, i.e. the volume
measured in the instantaneous rest frame. In contrast to $\sigma_0$, we use another
symbol $\sigma$ to denote the density of relativistic mass in the observer’s frame S.
Seen from the observer’s frame, the above collection of $\delta N$ particles is
now confined within a smaller volume $\delta V = \delta V_0/\Gamma$, and the relativistic mass
of this collection is $\delta m = \Gamma\,\delta m_0$. Therefore,

$$ \sigma(x) = \lim_{\delta V \to 0} \frac{\delta m}{\delta V} = \lim_{\delta V_0 \to 0} \frac{\Gamma\,\delta m_0}{\delta V_0/\Gamma} = \Gamma^2 \sigma_0(x). \tag{12.38} $$

We shall work out the equations of motion of the energy and momentum
content of the volume $\delta V$. The relativistic mass of this volume is

$$ \delta m = \sigma(x)\,\delta V. \tag{12.39} $$


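Therefore, according to formulas (8.19) and (8.18(b), 8.18(c)) of Sec. 8.4,
the 4-momentum of the mass content within this volume is $\delta p^\mu = (\delta p^0, \delta\mathbf{p})$,
where

$$ \delta p^0 = \delta m\,c = (\sigma\,\delta V)\,c, \qquad \delta\mathbf{p} = \delta m\,\mathbf{u} = (\sigma\,\delta V)\,\mathbf{u}. \tag{12.40} $$

Let us now go back to the equation of motion (12.36). We shall expand the
left-hand side corresponding to μ = 0, using the time component of $\delta p^\mu$ as
given in (12.40). With some help from (12.28):

$$ \frac{d\,\delta p^0}{dt} = c\,\frac{d}{dt}(\sigma\,\delta V) = c\left[\frac{\partial \sigma}{\partial t} + \nabla \cdot (\sigma\mathbf{u})\right] \delta V, \tag{12.41} $$

and corresponding to μ = i = 1, 2, 3 in a similar way with help from (12.31):

$$ \frac{d\,\delta p^i}{dt} = \frac{d}{dt}(\sigma u^i\,\delta V) = \left[\frac{\partial (\sigma u^i)}{\partial t} + \nabla \cdot (\sigma u^i \mathbf{u})\right] \delta V. \tag{12.42} $$

We shall combine the above two equations to obtain the following
identity:

$$ \frac{d\,\delta p^\mu}{dt} = \left[\nabla_\nu \left(\sigma_0 U^\mu U^\nu\right)\right] \delta V. \tag{12.43} $$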


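We have used Eq. (8.7), which gives the 4-velocity $U^\mu = \Gamma\,(c, \mathbf{u})$, in which $\mathbf{u}$
is the velocity of a particle at the event point (x), together with $\sigma = \Gamma^2\sigma_0$ from (12.38).
Going back to (12.36) we rewrite the same EoM in the compact form:

$$ \nabla_\nu \left(\sigma_0 U^\mu U^\nu\right) = f^\mu(x), \tag{12.44} $$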


which is the EoM for a system of incoherent dust subjected to the 4-force
$\vec{f} = \vec{e}_\mu f^\mu$ per unit proper volume.

Much of our discussions to follow will be based on Eq. (12.44). 

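The compact four-dimensional EoM (12.44) has the following (time,
space) components:

$$ \frac{\partial \sigma}{\partial t} + \nabla \cdot (\sigma\mathbf{u}) = \frac{\varpi}{c^2}, \qquad
 \frac{\partial (\sigma\mathbf{u})}{\partial t} + \nabla \cdot (\sigma\mathbf{u}\mathbf{u}) = \mathbf{f}. \tag{12.45} $$

The time part is the energy equation, and the space part the momentum
equation.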
We now define the energy tensor of the incoherent fluid (also called
incoherent dust) as^d

$$ D^{\mu\nu} \stackrel{\rm def}{=} \sigma_0 U^\mu U^\nu. \tag{12.46} $$

It is now very easy to identify the 4 x 4 components of $D^{\mu\nu}$:

$$ D^{00} = \sigma c^2, \qquad D^{0k} = D^{k0} = \sigma c\,u_k, \qquad D^{jk} = \sigma u_j u_k. \tag{12.47} $$

The EoM, written as (12.44), takes the beautiful comprehensive form:

$$ \nabla_\nu D^{\mu\nu} = f^\mu. \tag{12.48} $$
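
The component structure of the dust tensor (12.46) is easy to confirm numerically. The following minimal sketch builds $D^{\mu\nu} = \sigma_0 U^\mu U^\nu$ for a sample velocity and compares its entries with $\sigma c^2$, $\sigma c\,u_k$ and $\sigma u_j u_k$, where $\sigma = \Gamma^2\sigma_0$ as in (12.38). The function names and the numerical values are illustrative.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def lorentz_factor(u):
    return 1.0 / math.sqrt(1.0 - sum(x * x for x in u) / C**2)


def dust_energy_tensor(sigma0, u):
    """D^{mu nu} = sigma_0 U^mu U^nu, Eq. (12.46), as a 4 x 4 list."""
    g = lorentz_factor(u)
    U = (g * C, g * u[0], g * u[1], g * u[2])
    return [[sigma0 * U[m] * U[n] for n in range(4)] for m in range(4)]


sigma0 = 2.0                          # proper rest-mass density in kg/m^3, illustrative
u = (0.6 * C, 0.0, 0.0)               # fluid velocity, illustrative
sigma = lorentz_factor(u) ** 2 * sigma0  # relativistic mass density, Eq. (12.38)
D = dust_energy_tensor(sigma0, u)
print(D[0][0] / (sigma * C**2))       # close to 1: energy density sigma c^2
print(D[0][1] / (sigma * C * u[0]))   # close to 1: c times momentum density
print(D[1][1] / (sigma * u[0] ** 2))  # close to 1: momentum flux density
```

The tensor is symmetric by construction, and every entry involving a velocity component that vanishes is itself zero.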


12.6. Energy Tensor for a System of Charged Incoherent Fluid 


We prepared the groundwork for this section in Sec. 12.3, in particular
through Eq. (12.21). Before proceeding further we shall recognize the
following two volume 4-force densities:

$\vec{f}_{\rm fld \to mat}$ = 4-force per unit proper volume from the em fld
on the particles in the dust,

$\vec{f}_{\rm mat \to fld}$ = 4-force per unit proper volume from the particles in the dust
on the em fld.

Let us now understand the effect of the above two 4-force densities.

We can now go back to (12.21) and rewrite the 4-momentum
conservation equation as^e:

$$ \vec{f}_{\rm fld \to mat} = -\vec{f}_{\rm mat \to fld}. \tag{12.51} $$


The above equation represents a generalization of Newton’s third law of
motion for the 3-forces of action and reaction to the 4-forces of action and
reaction between a charged fluid medium and its own electromagnetic field.

The EoM of the charged dust is given by Eq. (12.48), in which the
“force” $\vec{f}$ is now the electromagnetic force on matter, i.e. $\vec{f} = \vec{f}_{\rm fld \to mat}$, as
given in (12.20), exerted by the em field originating from the charge-current
density $J^\alpha$ present in the matter itself.

The reader may re-read the statement following Eq. (12.36) to clear a
possible doubt. The quantity $\vec{f}_{\rm fld \to mat}(x)\,\delta V$ is the 4-force exerted on matter
inside the elementary lab volume $\delta V$ by the charge-current distribution
residing in the rest of the volume $V - \delta V$, outside $\delta V$. In this sense, this 4-
force is an “external” 4-force (not a “self force”) determining the fate of the
matter inside $\delta V$. The quantity $(\nabla_\alpha D^{\mu\alpha}(x))\,\delta V$ is the time rate of change of the
4-momentum of the particles in this volume, according to Eqs. (12.46) and
(12.43).

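The EoM (12.48) is now written as

$$ \nabla_\alpha D^{\mu\alpha}(x) = f^\mu_{\rm fld \to mat}. $$

But, $f^\mu_{\rm fld \to mat} = -f^\mu_{\rm mat \to fld} = -\nabla_\alpha M^{\mu\alpha}(x)$, (12.52)
by (12.51) and (12.50b).
Hence, $\nabla_\alpha D^{\mu\alpha}(x) = -\nabla_\alpha M^{\mu\alpha}(x)$.


The system consisting of the matter (represented by $D^{\mu\nu}$) and its own em
fld (represented by $M^{\mu\nu}$) is now a closed system. Its energy tensor is

$$ T^{\mu\nu}(x) = D^{\mu\nu}(x) + M^{\mu\nu}(x), \tag{12.53} $$

satisfying

$$ \nabla_\alpha T^{\mu\alpha}(x) = 0. \tag{12.54} $$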


In Newton’s theory of gravitation a massive star, or a massive planet, is the 
source of gravitation. In Einstein’s General Theory of Relativity mass is 
replaced by energy. However, energy itself has no respectable status, 
because energy is the time component of 4-momentum. Hence, energy is 
replaced by 4-momentum, and energy density (analogous to mass density) 
by the energy tensor, which is loosely the density of 4-momentum. Since a 
star is an isolated object, its energy tensor must have zero 4-divergence. 
Equation (12.53) gives the simplest example of such an energy tensor, and 
Eq. (12.54) tells us the desirable property of such a source of gravitation. 


12.7. Energy Tensor of a Closed System 


By the defining property (see first paragraph of Sec. 12.4), the energy and 
momentum of a closed system are strictly conserved. This conservation 
statement is succinctly expressed in the mathematical equation: 


where $T^{\mu\nu}(x)$ is the energy tensor of the system at the event point (x).
Equation (12.54) gives an example of such a tensor. 

Equation (12.55) represents four equations, of which the μ = 0
component expresses energy conservation, and the μ = 1, 2, 3 components
the momentum conservation, since energy divided by c, and the momentum
3-components, together, constitute the 4-momentum $p^\mu$.

Before we proceed towards the construction of the energy tensor of a
perfect fluid (which we take up in the next section), it will help to
understand the full import of Eq. (12.55).

To see the energy part of (12.55) we set μ = 0. Now multiplying either
side by c and expanding we obtain:


This becomes the energy conservation equation, conforming to the format 
of the continuity equation shown way back in Eq. (11.14), by identifying: 


where w stands for the energy density, and the 3-vector S for the energy flux 
density. Equation (12.56) can now be rewritten in the standard form of the 
continuity equation: 


In a similar manner we set μ = 1, 2, 3 to view the momentum
conservation part of the equation (12.55), which yields:


As before, the above equation becomes the conservation equation for the ith 
component of momentum by identifying: 


where $\mathbf{g}(x) = (g^1(x), g^2(x), g^3(x))$ stands for the momentum density vector,
and $\mathbf{T}^i(x) = (T^{1i}(x), T^{2i}(x), T^{3i}(x))$ for the flux density vector for the ith
component of momentum. Hence, we can rewrite Eq. (12.59) as


Note that $T^{ki}$, which stands for the kth Cartesian component of the
momentum flux density vector $\mathbf{T}^i$, is the (ki)-element of the energy tensor
$\mathsf{T}(x)$. If we multiply either side of the above equation with $\mathbf{e}_i$ and sum over
i, and make use of Eq. (5.68), the momentum conservation is expressed
compactly in the form:


where is the momentum flux density 3-tensor, and comprises the space- 
space components of the energy tensor. 
The energy tensor is strictly symmetric, i.e. 


The symmetry of the space-space components is related to the
conservation of angular momentum (see [34], p. 170). The symmetry
between the space-time and the time-space components is hidden in
(12.57) and (12.60), as we shall now show.

The mass-energy equivalence $E = mc^2$ associates with every energy
density w(x) an effective mass density σ(x) such that

$$ w(x) = \sigma(x)\,c^2. \tag{12.64} $$


The energy flux density $\mathbf{S}(x)$ suggests an equivalent transport velocity $\mathbf{v}(x)$
such that

$$ \mathbf{S}(x) = w(x)\,\mathbf{v}(x). \tag{12.65} $$


In the case of the electromagnetic field, this transport velocity is the 
same as the velocity of light. In the case of a perfect fluid in which pressure 
exists, it is however different from the velocity u(x) of the fluid, as we shall 
see in the next section. 


Also, momentum p and the effective mass m are connected by the
relation p = mv. This is true for relativistic and non-relativistic systems. It is
obviously valid for a particle if we take m as the relativistic mass. It holds
for a photon if we interpret $E/c^2$ as the effective mass of the photon.
Extending the above mass-momentum relationship to the corresponding
densities, we get $\mathbf{g} = \sigma\mathbf{v}$. Use of (12.65) now leads to the following relationship
between g and S:

$$ \mathbf{g}(x) = \frac{\mathbf{S}(x)}{c^2}. \tag{12.66} $$

It now follows from (12.57) and (12.60) that $T^{0k} = T^{k0}$.
To summarize:


12.8. Energy Tensor of a Perfect Fluid 


We shall take up another simplified model of a closed system to shed 
further light on the meaning of energy tensor. We shall consider a gas, as in 
the previous section, but drop the assumption of tenuousness, so that the gas 
particles can be imagined to interact through collision, thereby giving rise 
to fluid pressure. 

When a fluid is compressed, its energy increases. Therefore, the fluid 
under pressure has extra elastic energy which must show up as extra mass 
density. In order to keep the discussion simple, we shall assume (a) that the 
fluid is perfect, so that the stress field inside has the simplest possible form, 
as given by Eq. (5.81), and (b) the heat exchange can be ignored so that the 
dynamical processes can be treated as adiabatic. We shall present a 
simplified treatment for this case. For a rigorous treatment, with a proper 
analysis of the stress tensor inside the fluid, the reader may look up [34, 
Sec. 6.6]. 

As in Sec. 12.5, consider the change in the energy-momentum content
of a given number $\delta N$ of particles confined within a variable volume $\delta V$, as
these particles move in bulk as a stream. The rest mass of this volume is
$\sigma_0(x)\,\delta V_0$, whereas its relativistic mass in the observer’s frame (which
includes the mass $\Gamma^2\sigma_0(x)\,\delta V$ as given in Eq. (12.38) plus some extra mass
generated in this volume due to the work done by the pressure force) is
$\sigma(x)\,\delta V$.

As the fluid moves, the energy content of $\delta V$ changes due to the work
done along its surface by the surface force. Consider an area element $\delta a$ at a
point P on the surface, as in Fig. 12.2(a). The force on this area is $-p\,\mathbf{n}\,\delta a$,
where $\mathbf{n}$ is the outward normal. The surface element is moving with the
velocity $\mathbf{u}$ under the pressure force. The work being done by this force is
$-p\,\mathbf{n} \cdot \mathbf{u}\,\delta a$ per unit time. The net work on the closed surface per unit time is
the surface integral of this elementary work, which we shall convert into a
volume integral using Gauss’s theorem, to obtain the rate at which the
energy of the volume V is increasing.


Taking the above volume V to be very small, equal to $\delta V$, and dividing
both sides of the equation by $\delta V$, we get the rate of change of energy per unit
volume per unit time (same as what we called power density in Sec. 12.2)
as

$$ \varpi = -\nabla \cdot (p\,\mathbf{u}). \tag{12.69} $$

Interestingly, we can expand the above expression into two parts:

$$ -\nabla \cdot (p\,\mathbf{u}) = -\mathbf{u} \cdot \nabla p \tag{12.70a} $$

$$ \qquad\qquad -\,p\,\nabla \cdot \mathbf{u}. \tag{12.70b} $$

Equation (12.70a) shows how the pressure volume force $-\nabla p$, as given by
Eq. (5.82), alters the kinetic energy of the fluid per unit volume per unit time, and Eq.
(12.70b) shows how the expansion of the gas against the external pressure
contributes an elastic energy (same as the potential energy) per unit volume
per unit time.

We now write the energy equation, noting that $\sigma c^2$ is the energy
density. The rate of change of energy inside the volume $\delta V$ is equal to $\frac{d}{dt}(\sigma c^2\,\delta V)$.
Hence, by (12.69)


By Eq. (12.28) 


Hence, the time component of the EoM: 


Equation (12.73) represents the μ = 0 component of the fundamental
conservation equation (12.55) and is equivalent to (12.56), and then to
(12.58). Comparing, we identify the $T^{0\nu}$ components of the energy tensor:


Before writing the momentum equation of motion, we shall need an 
expression for the momentum density g in the frame S. This follows straight 
from Eqs. (12.66) and (12.74b). 


The momentum content of the volume $\delta V$ is therefore,


To obtain the rate of change of this momentum we shall use Eq. (12.31), 
replace o with 


The only force acting on this fluid element, as already assumed, is the 
pressure force, namely, 


The EoM for momentum is 


Therefore, from (12.77)—(12.79) 


Equation (12.80) is to be identified with (12.59). Hence, the remaining 
components of the energy tensor: 


The components of $T^{\mu\nu}$ shown in (12.74) and (12.81) do not present a
covariant expression, because the quantities on the right-hand side are not
4-vectors or 4-scalars. We shall correct this defect.

Let us go back to the IRS $S_0$, in which the components of $T^{\mu\nu}$ take the
following diagonal form, obtained by setting u = 0 in (12.74) and (12.81):


where $p_0$ is the pressure in $S_0$. According to Corollary #3 in Sec. 8.7.4, the
pressure p is Lorentz invariant, i.e.


The relativistic mass density $\sigma(x)$ has to be expressed in terms of $\sigma_0(x)$.
For this purpose, note that


where represents the Lorentz transformation matrix corresponding to the 
boost from the rest frame $S_0$ to the observer’s frame S. Replacing $\beta$ by $-\beta$
in Eq. (3.20) of Sec. 3.2 we get the relevant components of the LT:


Substituting (12.85) in (12.84), and recognizing the $T^{00}$ component from
Eq. (12.74a), we get the following expression for $\sigma(x)$:


Hence, 


Substituting (12.83) and (12.86) in (12.74) and (12.81), we transform $T^{\mu\nu}$ to
the following form: 


It is now obvious from Eq. (12.87) that the energy tensor for a perfect 
fluid has the following compact covariant expression: 


where $g^{\mu\nu}$ is the metric tensor and $U^\mu(x)$ is the 4-velocity field of the fluid.
Note that we have dropped the subscript “0” under p, because $p = p_0$.
Compare the energy tensor of the perfect fluid with that of the 
incoherent dust shown in (12.46), and note the difference between the two 
energy tensors: 


The extra mass-energy-momentum represented by $\delta T^{\mu\nu}$ is entirely due to
fluid pressure.
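
The role of the pressure term can be illustrated numerically. The sketch below assumes the standard covariant form of the perfect-fluid tensor, $T^{\mu\nu} = (\sigma_0 + p/c^2)\,U^\mu U^\nu - p\,g^{\mu\nu}$ with $g^{\mu\nu}$ = diag(1, -1, -1, -1). This form is an assumption made for the illustration, chosen because it reduces to diag($\sigma_0 c^2$, p, p, p) in the rest frame. The function names and the (deliberately extreme) sample values are hypothetical.

```python
import math

C = 299_792_458.0  # speed of light in m/s
METRIC = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]


def four_velocity(u):
    g = 1.0 / math.sqrt(1.0 - sum(x * x for x in u) / C**2)
    return (g * C, g * u[0], g * u[1], g * u[2])


def dust_part(sigma0, u):
    """sigma_0 U^mu U^nu, the incoherent-dust tensor of Eq. (12.46)."""
    U = four_velocity(u)
    return [[sigma0 * U[m] * U[n] for n in range(4)] for m in range(4)]


def pressure_part(p, u):
    """ASSUMED pressure contribution: (p/c^2) U^mu U^nu - p g^{mu nu}."""
    U = four_velocity(u)
    return [[(p / C**2) * U[m] * U[n] - p * METRIC[m][n] for n in range(4)]
            for m in range(4)]


def perfect_fluid_tensor(sigma0, p, u):
    """Sum of the dust part and the pressure part."""
    D, dT = dust_part(sigma0, u), pressure_part(p, u)
    return [[D[m][n] + dT[m][n] for n in range(4)] for m in range(4)]


sigma0, p = 1.0, 3.0e16  # kg/m^3 and Pa, illustrative values
T_rest = perfect_fluid_tensor(sigma0, p, (0.0, 0.0, 0.0))
print([T_rest[m][m] for m in range(4)])  # approximately [sigma0 c^2, p, p, p]
dT = pressure_part(p, (0.6 * C, 0.0, 0.0))
print(dT[0][0] / p)                      # close to Gamma^2 - 1 = 0.5625
```

In the rest frame the pressure adds nothing to the energy density, while in a moving frame it contributes $p\,(\Gamma^2 - 1) = p\,\Gamma^2 u^2/c^2$ to $T^{00}$.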


^a See [13, pp. 258-261].
^b See [33, p. 152].
^c See [34, p. 136].
^d See [34, p. 141].
^e See [33, p. 152].


Appendices 


A.1. Energy Conservation in Electromagnetic Field 


We shall prove Poynting’s theorem as given in Eq. (12.9) using Maxwell’s
equations (11.39). The electric current density appears in Eq. (11.39b).


A.2. Examples of Lowering and Raising an Index


Example 1.

$$ V_\mu = V^\nu g_{\nu\mu} = (V^0, V^1, V^2, V^3)
\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}
 = (V^0, -V^1, -V^2, -V^3). \tag{A.1} $$


Lowering or Raising ⇒ No change in the time component. Sign change in
the space components.
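
The rules for lowering indices can be tried out with a few lines of Python. The sketch below uses the metric diag(1, -1, -1, -1) of Example 1. The helper names and the sample tensor are illustrative, and the mixed trace is taken here as $F^{\mu}{}_{\mu}$, i.e. the trace after lowering the second index.

```python
METRIC = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]


def lower_vector(V):
    """V_mu = V^nu g_{nu mu}, as in Eq. (A.1)."""
    return [sum(V[n] * METRIC[n][m] for n in range(4)) for m in range(4)]


def lower_first(F):
    """F_mu^nu = g_{mu alpha} F^{alpha nu}: sign change in rows 1, 2, 3."""
    return [[sum(METRIC[m][a] * F[a][n] for a in range(4)) for n in range(4)]
            for m in range(4)]


def lower_second(F):
    """F^mu_nu = F^{mu alpha} g_{alpha nu}: sign change in columns 1, 2, 3."""
    return [[sum(F[m][a] * METRIC[a][n] for a in range(4)) for n in range(4)]
            for m in range(4)]


def mixed_trace(F):
    """F^mu_mu, summed over mu."""
    mixed = lower_second(F)
    return sum(mixed[m][m] for m in range(4))


V = [5, 1, 2, 3]
F = [[10 * m + n + 1 for n in range(4)] for m in range(4)]  # sample components
print(lower_vector(V))               # [5, -1, -2, -3]
print(lower_first(lower_second(F)))  # only the {0k, k0} components change sign
print(mixed_trace(F))                # F^00 - F^11 - F^22 - F^33
```

Lowering both indices multiplies the component (μ, ν) by $g_{\mu\mu}\,g_{\nu\nu}$, which is why the {00, kj, jk} components are unchanged.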


Example 2. Let 


be a contravariant 4-tensor. We shall lower only the first index μ, then only
the second index ν, then both indices μ, ν.


First index lowered = No change in row 0. Sign change in rows 1, 2, 3. 


Second index lowered = No change in col 0. Sign change in cols 1, 2, 3. 


Both indices lowered = No change in {00, kj, jk} components. Sign change 
in {0k, k0} components. 


Example 3. The trace of the contravariant tensor $F^{\mu\nu}$ is defined as $F^{\mu}{}_{\mu}$, summed over $\mu$. 
Going back to (A.5), 


A.3. Components of Maxwell’s Stress 3-Tensor and Maxwell’s 
4-Tensor, and Their Traces 


Maxwell’s 3-tensor was written in a short form in Eq. (6.46). We shall now 
write down the 3 × 3 components of this tensor. The reader should verify 
them. 


We can now write the trace of the Maxwell 3-tensor: 


Maxwell’s 4-tensor was defined by Eq. (12.17). We shall use the same 
equation to identify all the components of $M^{\mu\nu}(x)$. 


In the following, we shall write to mean respectively. 


We can now write all the 4 × 4 components of $M^{\mu\nu}(x)$. 


Because of Eq. (12.13), and the tensor is symmetric. 
The trace of Maxwell’s 4-tensor follows from (A.7) and (A.9). 
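
As a cross-check on the 3-tensor part of this section, the sketch below evaluates the stress 3-tensor numerically and takes its trace. It assumes SI units and the common convention $T_{ij} = \varepsilon_0 (E_i E_j - \tfrac{1}{2}\delta_{ij}E^2) + \tfrac{1}{\mu_0}(B_i B_j - \tfrac{1}{2}\delta_{ij}B^2)$. The overall sign adopted in Eq. (6.46) may differ, in which case the trace changes sign accordingly. The function names and field values are hypothetical.

```python
EPS0 = 8.8541878128e-12   # vacuum permittivity (F/m)
MU0 = 1.25663706212e-6    # vacuum permeability (H/m)


def maxwell_stress(E, B):
    """3 x 3 stress tensor in the assumed convention
    T_ij = eps0 (E_i E_j - delta_ij E^2 / 2) + (B_i B_j - delta_ij B^2 / 2) / mu0."""
    E2 = sum(e * e for e in E)
    B2 = sum(b * b for b in B)
    return [[EPS0 * (E[i] * E[j] - 0.5 * (i == j) * E2)
             + (B[i] * B[j] - 0.5 * (i == j) * B2) / MU0
             for j in range(3)] for i in range(3)]


def energy_density(E, B):
    """Field energy density u = (eps0 E^2 + B^2 / mu0) / 2."""
    return 0.5 * (EPS0 * sum(e * e for e in E) + sum(b * b for b in B) / MU0)


E = (3.0, 0.0, 4.0)       # illustrative field in V/m
B = (0.0, 2.0e-8, 0.0)    # illustrative field in T
T = maxwell_stress(E, B)
u = energy_density(E, B)
trace_T = T[0][0] + T[1][1] + T[2][2]
print(trace_T, -u)        # in this convention the trace equals -u
```

In this convention the trace of the 3-tensor equals minus the energy density of the field, since each of the two quadratic terms contributes $1 - 3/2 = -1/2$ times its square.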


B.1. Useful Integrals 


We shall derive the values of some integrals required in this book. The 
integrands of all the integrals will have in their denominators 
integer or half-integer powers of the expression $(r^2 + a^2 - 2ra\cos\theta)$; the 
integration variable will be $\theta$, and the range of integration $[0, \pi]$. We shall 
do some preliminary work by changing the variable of integration from $\theta$ to 
$\eta$, with the accompanying change of the limits of integration, and conversion of 
the numerators for the first two cases: 


Direct Evaluation 


Using the above conversion hints, it should not be difficult for the reader to 
establish the following integrals: 


Integral #1 


Integral #2 
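
To illustrate the method of direct evaluation, consider one representative integral of this family (an assumed example, not necessarily identical to Integral #1 or #2): $\int_0^{\pi} \sin\theta \, d\theta / \sqrt{r^2 + a^2 - 2ra\cos\theta}$. With the substitution $\eta^2 = r^2 + a^2 - 2ra\cos\theta$, so that $\eta\,d\eta = ra\sin\theta\,d\theta$ and the limits become $|r - a|$ and $r + a$, the integral reduces to $(r + a - |r - a|)/(ra)$. The sketch below compares this closed form with a simple midpoint-rule quadrature. The function names are hypothetical.

```python
import math


def closed_form(r, a):
    """Value obtained from the substitution eta^2 = r^2 + a^2 - 2 r a cos(theta)."""
    return (r + a - abs(r - a)) / (r * a)


def numeric_integral(r, a, n=20000):
    """Midpoint-rule estimate of the integral over theta in [0, pi]."""
    h = math.pi / n
    total = 0.0
    for k in range(n):
        theta = (k + 0.5) * h
        total += math.sin(theta) / math.sqrt(r * r + a * a
                                             - 2.0 * r * a * math.cos(theta))
    return total * h


for r, a in [(2.0, 1.0), (1.0, 3.0)]:   # one case with r > a, one with r < a
    print(r, a, numeric_integral(r, a), closed_form(r, a))
```

The absolute value in the lower limit is what distinguishes the two cases r > a and r < a; the closed form can equally be written as 2/max(r, a).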


Evaluation using Maxima 


We have evaluated the following three integrals, using Maxima (version 
5.13.0). We shall first write down the values of the integrals, and then show 
the commands used in Maxima to obtain these results. Let us write 


Integral #3 


Integral #4 


Integral #5 


Maxima Commands, Inputs and Outputs 


We shall reproduce the interactive commands and prompts exchanged between the user and 
Maxima, so that the reader can verify the values of integrals #4 and 
#5. Note the following: 


1. Some output lines (e.g. %o5, %o6 in Example #4) are spread over two lines, in 
which the first line contains the “indices”, e.g. “to the power 2”. These 
indices get displaced and detached from the base when the output is 
copied into any text file. To avoid this anomaly, we have brought them onto 
one line using mathematical mode. 

2. If the output is an expression containing a definite integral, it is spread over 
seven lines (e.g. %o9 in Example #4), and the integral sign 
becomes unintelligible when copied. We have replaced these outputs, and 
other outputs that appear too long and complicated, with a placeholder. All outputs 
except the final one are non-essential. 


Input/Output for Integral #4 


To simplify the last output (%o11), set 


and get for the first case and 0 for the second. 


Input/Output for Integral #5 


Epilogue 


In the high reaches of the Himalayas, the ice of the Gangotri glacier melts, 
and as the meltwater descends the mountains into the plains, it is swelled by 
tributaries and groundwater to form a mighty river, the Ganga. But the Ganga 
is not just a river. It is also a concept, one of the pillars of a Faith that has 
moulded a civilization. It could be an edifying experience, intellectually and 
physically, to journey down the course of the river, from the mountain to the 
sea, as it meanders past holy cities and historical monuments. 

Nestled in the lofty heights of scientific analysis and philosophical 
ruminations, great and epoch-making theories are occasionally born. Some 
of them cause a deluge of such magnitude as to shake everything in their 
way, altering the course of history, uprooting the foundations of 
conventional notions and concepts, and building on the ruins of destruction 
another edifice of much greater vigour and beauty. It can be an edifying 
experience to make an intellectual voyage downstream of the deluge and 
thrill at the profound upheavals that one intellectual feat of a human mind 
could bring about. 

Sometimes the impact is noticeable to all sections of society, as for 
example in the case of Faraday’s discovery of Electromagnetic Induction, 
leading to widespread use of electricity. More often, however, the impact is 
of a more subtle and esoteric nature, comprehensible to the avowed 
practitioners of the discipline. Newton’s formulation of Classical Mechanics 
and Gravitation falls in the latter category. It replaced the decrepit 
Aristotelian beliefs with a system of analysis whose mind-boggling 
profundity, universality, power and simplicity laid the foundations of 
physics. Much of our study of classical physics is a journey along the 
course of a river that finds its source in the laws of motion and gravitation 
as conceived by Newton, and streams through the macroworld of planets 
and satellites, the earthly world of missiles and locomotives, down to the 
microworld of atoms and molecules, as if to unify the three worlds in a 
single grand design. Even though the advent of Quantum Mechanics curbed 
the role of Newton in the microworld, and the General Theory of Relativity 
redefined cosmological concepts, they are not to be regarded as an 
abandonment of Newtonian ideas, but a refinement of the same. The 
unifying spirit of Newton has been the driving force behind the pursuits of 
physics among all succeeding generations. 

In this book, we have tried to trace the course of another great river, the 
Theory of Relativity, which finds its origin in the belief that all frames of 
reference are equal. Seen in isolation, this principle is just an extrapolation 
of the egalitarian value system from society to reference frames, without 
any visible impact. However, in juxtaposition with the laws of 
electrodynamics, this innocuous hypothesis makes startling revelations 
about the relative nature of space and time, leading to paradoxes of time 
dilation and length contraction (Chapter 2). An immediate outcome is the Lorentz 
transformation, providing a uniform prescription for the conversion of the 
time and space coordinates of an event between frames of reference 
(Chapter 3). 

The same Lorentz transformation does not seem to fit within the 
scheme of Newtonian Mechanics unless momentum is redefined and the 
energy expression is modified. But this does not seem possible without 
recognizing mass-energy equivalence. The innocuous hypothesis then 
opens up an entirely new world where matter is annihilated to liberate 
energy. New branches of physics are born, the physics of Atomic Nuclei 
and the physics of Elementary Particles, with awesome forebodings of a 
nuclear holocaust (Chapter 4). 

The mathematical language in which the laws of physics, in particular 
the conservation of energy and momentum, are expressed requires an 
introduction to the world of tensors. Maxwell’s stress tensor gives a beautiful 
illustration of the stress that exists in empty space, in the vicinity of electric 
charges and currents, or anywhere else where an electromagnetic field exists 
(Part II). 

The vision of relativity is incomplete unless physical quantities are 
looked upon as four-dimensional geometrical objects having one time and 
three space components. Realization of this four-dimensional world, called 
space-time, begins with the construction of the Minkowski metric giving the 
expression for a line element stretching between two events. The vision of 
this four-dimensional scheme in the workings of the universe is then 
consummated through the Principle of Covariance, which declares that the 
mathematical expression of a law of physics must be an equation 
between two 4-tensors of the same rank and type. This leads to the 
unification of energy and momentum within a 4-vector, the 4-momentum 
(also called En-Mentum in this book), and to Minkowski’s equation of 
motion (EoM) in terms of another 4-vector, the 4-force (also called 
Pow-Force in this book). An illustration of this new scheme is provided by the 
relativistic rocket, and by the covariant equations of classical electrodynamics 
(Part III). 

A further example of the Principle of Covariance is provided by the energy 
tensor, which not only illustrates the succinctness, elegance and power 
of the energy-momentum (i.e. 4-momentum) conservation equations but 
also builds a bridge to the General Theory of Relativity, which is also the 
Relativistic Theory of Gravitation (Part IV). 

Within the limited scope of this book we have thus taken the reader on a 
guided tour of the Relativity Valley. Originating from a modest creed, 
nurtured by the Laws of Classical Electrodynamics, swelled by the 
conservation laws of energy-momentum, embellished by the demands of 
Covariance, the relativity theory turns into a mighty river providing 
nourishment to all branches of modern physics. Midway through our 
downstream journey we had to call it a day, as the river was about to enter a 
deep canyon of exquisite beauty, the domain of the General Theory of Relativity. 
Some readers, inspired by their visit to the temple of the energy tensor to 
attempt a rafting expedition down these turbulent rapids, may now feel 
dejected at this last-moment betrayal by their captain. 
However, a great expedition requires not only great courage but also the 
necessary equipment and, above all, arduous preparation and training to 
face unexpected challenges in a land of adventure. It is now time to undergo 
this training from a master of tensor calculus. We shall resume our voyage, 
hopefully in the future, after an interlude that gives the impatient 
expeditioner some time for sober reflection. 